I torture data for a living. Optimistic. Ambitious. Foodie. Travel enthusiast.
Associate Data Scientist at Hansa Cequity, Mumbai. Responsible for providing analytics solutions to a banking client and a media client. As a member of the Predictive Modeling team at Hansa Cequity, I built Purchase Propensity Models, Churn Models (Temporary & Permanent), Time Series Forecasting Models & Recommendation Models for clients in the Banking & Financial Services and Media domains. I worked across every stage of the ML lifecycle, from defining the problem to model deployment. These models captured more than 70% of the events in the top 3 deciles. I also wrote the code for monthly scoring & tracking of the models. I was awarded the Falcon Award by the CEO for being among the top-performing new recruits.
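To illustrate what "events captured in the top 3 deciles" measures, here is a minimal Python sketch. The function name and the toy scores and labels are hypothetical and made up for illustration; it is not the code used on the client projects.

```python
def top_decile_capture(scores, labels, n_deciles=3):
    """Return (capture rate, lift) for the top n_deciles of ranked scores.

    scores: model probabilities; labels: 1 for an event, 0 otherwise.
    """
    # Rank customers from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    cutoff = len(ranked) * n_deciles // 10
    events_in_top = sum(label for _, label in ranked[:cutoff])
    capture = events_in_top / sum(labels)
    # Lift compares the capture rate with the share of the base contacted.
    lift = capture / (cutoff / len(ranked))
    return capture, lift


# Toy data, invented for illustration only.
scores = [0.95, 0.90, 0.85, 0.60, 0.55, 0.40, 0.35, 0.20, 0.15, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
capture, lift = top_decile_capture(scores, labels)
print(f"capture={capture:.2f}, lift={lift:.2f}")
```

A lift above 1 means the top deciles hold more events than a random selection of the same size would.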
1) Customer Churn Model: To predict whether a de-activated customer will churn. With a model lift of 2.3, the model helped cut the churn rate by 25% by enabling the contact center to target high-propensity customers in the top 3 deciles. Involved EDA, feature engineering, categorization of continuous variables, feature selection using information value and weight of evidence, and evaluation & validation in R using gains & lift charts, AUC & the KS statistic.
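A minimal Python sketch of weight of evidence (WoE) and information value (IV) for one binned variable. It assumes the convention WoE = ln(% non-events / % events) and that every bin has at least one event and one non-event. The bin names and counts are hypothetical, and the original work was done in R.

```python
import math


def woe_iv(bins):
    """bins maps bin name -> (n_events, n_non_events).

    Returns (dict of WoE per bin, total information value).
    """
    total_events = sum(events for events, _ in bins.values())
    total_non_events = sum(non_events for _, non_events in bins.values())
    woe = {}
    iv = 0.0
    for name, (events, non_events) in bins.items():
        pct_events = events / total_events
        pct_non_events = non_events / total_non_events
        # Assumed convention: ln(% non-events / % events).
        woe[name] = math.log(pct_non_events / pct_events)
        iv += (pct_non_events - pct_events) * woe[name]
    return woe, iv


# Hypothetical bins of a continuous variable after categorization.
bins = {"low": (10, 90), "mid": (30, 60), "high": (60, 50)}
woe, iv = woe_iv(bins)
print(woe, iv)
```

Variables with a higher IV separate events from non-events more strongly, which is why IV is a common screening criterion before model fitting.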
2) Forecasting: Predicting the number of promo calls to be made for each product over the next 3 months using time series. Used the Box-Jenkins methodology (ARIMA) for forecasting. The forecast enabled an optimal allocation of contact-center executives, minimizing both resource idle time and resource shortage.
3) Market Basket Analysis: For the recommendation of media packs/plans. Obtained the top association rules by setting minimum thresholds for support & confidence. For association rules in which one antecedent had multiple consequents, implemented customer base profiling using a decision tree to recommend the pack a customer is most likely to purchase.
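A minimal Python sketch of filtering association rules by minimum support and confidence, restricted to item pairs for brevity. The pack names, baskets, and thresholds are hypothetical, and this is a brute-force count rather than the algorithm used on the project.

```python
from itertools import combinations


def pair_rules(transactions, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(transactions)
    item_count = {}
    pair_count = {}
    for basket in transactions:
        items = sorted(set(basket))
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for a, b in combinations(items, 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        for antecedent, consequent in ((a, b), (b, a)):
            # Confidence: share of antecedent baskets that also hold the consequent.
            confidence = count / item_count[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    return sorted(rules, key=lambda rule: rule[3], reverse=True)


# Hypothetical subscriptions, invented for illustration only.
baskets = [
    ["sports", "movies"],
    ["sports", "movies", "kids"],
    ["sports", "news"],
    ["movies", "kids"],
    ["sports", "movies"],
]
rules = pair_rules(baskets, min_support=0.4, min_confidence=0.7)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```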
4) Loan Purchase Propensity Model: To predict the convertibility of leads. With a model lift of 2.5, the model met the client's lead-targeting goal. Involved EDA, data cleaning, categorization of continuous variables, feature selection using information value, cross-validation in R, and evaluation using gains & lift charts & AUC. Classified leads into Hot, Warm & Cold by setting probability thresholds for each bank branch. Completed in-time & out-of-time model validation. Deployed the model query using SQL job scheduling for automated lead scoring, with a Test & Control approach for monthly tracking of model performance.
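A minimal Python sketch of classifying leads into Hot, Warm, and Cold with per-branch probability thresholds. The branch names, threshold values, and the fallback default are hypothetical; the deployed scoring ran as a scheduled SQL job.

```python
# Hypothetical fallback (hot, warm) cutoffs for branches without their own.
DEFAULT_THRESHOLDS = (0.6, 0.3)


def classify_lead(probability, branch, branch_thresholds):
    """Map a conversion probability to a tier using the branch's cutoffs."""
    hot_cutoff, warm_cutoff = branch_thresholds.get(branch, DEFAULT_THRESHOLDS)
    if probability >= hot_cutoff:
        return "Hot"
    if probability >= warm_cutoff:
        return "Warm"
    return "Cold"


# Hypothetical branches and cutoffs, invented for illustration only.
branch_thresholds = {"branch_a": (0.7, 0.4), "branch_b": (0.5, 0.2)}
print(classify_lead(0.55, "branch_a", branch_thresholds))  # Warm
print(classify_lead(0.55, "branch_b", branch_thresholds))  # Hot
```

The same score can land in different tiers at different branches, which is the point of setting thresholds per branch.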
5) Sentiment Analysis: Detecting hate speech on Twitter. Used Natural Language Processing techniques such as stemming, lemmatization & stop-word removal; vectorization techniques such as Bag of Words, TF-IDF & Word2vec; and classification using Naïve Bayes, Support Vector Machines, K-Nearest Neighbors & Artificial Neural Networks.
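A minimal Python sketch of one combination from the list above: stop-word removal, Bag of Words counts, and a multinomial Naïve Bayes classifier with add-one smoothing. The stop-word list, labels, and toy sentences are hypothetical, and it stands in for the library implementations a real project would use.

```python
import math
from collections import Counter, defaultdict

# Hypothetical, tiny stop-word list for illustration.
STOP_WORDS = {"the", "a", "is", "are", "you", "i", "this"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [word for word in text.lower().split() if word not in STOP_WORDS]


def train_nb(docs):
    """docs: list of (text, label). Returns Bag of Words counts per class."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in docs:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        class_counts[label] += 1
        vocab.update(tokens)
    return word_counts, class_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log posterior (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)
        denominator = sum(word_counts[label].values()) + len(vocab)
        for token in tokenize(text):
            if token in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training set, invented for illustration only.
docs = [
    ("you are awful and stupid", "hate"),
    ("i hate this awful group", "hate"),
    ("have a lovely day", "ok"),
    ("this is a great match", "ok"),
]
model = train_nb(docs)
print(predict_nb(model, "awful stupid people"))
print(predict_nb(model, "lovely great day"))
```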
6) Bank Stock Returns Prediction (Ongoing): I am trying to challenge the existing CAPM & market risk models for stock returns prediction. This project involves predicting the annual stock return of a given bank by clustering the banks and then building a model on each cluster. The dataset contains stock returns data for several banks over the last two decades.