Senior Engineer with hands-on experience in data and resource management. An agile, results-oriented, and insightful Data Analyst skilled in Python, R, and SQL, I use Machine Learning algorithms to solve real-world problems, dataset by dataset. My passion for Data Science comes from its core value: stop guessing and make informed decisions. Numbers speak louder than words, and that is what I aim to deliver through my skill set.
I have a demonstrated history of working in the food & beverages industry, which has fostered collaborative and goal-oriented behaviour.
My skill set includes:
Programming: Python [pandas | NumPy | scikit-learn | TensorFlow | Keras], R [dplyr | ggplot2], SQL
Machine Learning Algorithms: Neural Networks, Decision Trees, Random Forest, Ensemble Methods (Bagging & Boosting), Regression and Classification, KNN, SVMs, Deep Learning, Convolutional Neural Networks, Recurrent Neural Networks, Reinforcement Learning
Statistical Techniques: Regression Analysis, Time Series Forecasting, Hypothesis Testing, Stochastic Processes, and Gradient Descent
Applications & Databases: Microsoft Office Suite, Git, GitHub, Minitab, Tableau, SPSS, Microsoft Access
Engineering: Quality Control, Manufacturing, Process Engineering, Mechanical, and Maintenance Engineering
-
Experience
1. Teaching Assistant (Dec 2019-May 2020 & Aug 2020-Present)
Tutoring 104 students in the Industrial Engineering 342 (Probability and Statistics) course
2. Research Assistant (May 2020-Aug 2020)
Analyzed the effect of introducing an Augmented Reality (AR) interface as a data presentation method for data entry tasks
3. Senior Engineer (Jul 2017-May 2019)
Analyzed the performance of various machines and their components to minimize breakdown time and maximize production
-
Projects
SIIM-ISIC Melanoma Classification (Jun 2020-Aug 2020)
1. This was a computer vision competition hosted by Kaggle and sponsored by the SIIM-ISIC community.
2. The competition required identifying melanoma in images of skin lesions, in particular by comparing images from the same patient to determine which are likely to represent a melanoma.
3. Fine-tuned EfficientNet B0-B7 and Inception-ResNet v2 on skin lesion images at different sizes (256, 384, and 512). Performed 5-fold stratified cross-validation to identify the best-performing architecture. Applied heavy data augmentation to the training data (rotation, cropping, shear, grid masking, etc.) and to the test data (test-time augmentation).
4. The final model was a blend of EfficientNet models that achieved a score (AUC) of 0.9383.
5. Placed in the top 10% of participants on the private leaderboard and won a bronze medal.
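The validation and blending steps described above can be illustrated with a minimal sketch. It is not the competition code: it uses only the Python standard library, and the names `stratified_kfold`, `roc_auc`, and `blend`, along with the toy labels and equal blend weights, are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, n_splits=5, seed=0):
    """Yield (train_idx, valid_idx) pairs that keep the class ratio in each fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets its share of rare positives.
        for position, idx in enumerate(indices):
            folds[position % n_splits].append(idx)
    for k in range(n_splits):
        valid = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, valid


def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def blend(prediction_sets, weights=None):
    """Weighted average of several models' predictions for the same samples."""
    if weights is None:
        weights = [1.0] * len(prediction_sets)  # assumption: equal weights
    total = sum(weights)
    n_samples = len(prediction_sets[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, prediction_sets)) / total
        for i in range(n_samples)
    ]


if __name__ == "__main__":
    # Toy, imbalanced labels (10% positive), standing in for melanoma targets.
    toy_labels = [1] * 10 + [0] * 90
    for fold, (train_idx, valid_idx) in enumerate(stratified_kfold(toy_labels)):
        n_pos = sum(toy_labels[i] for i in valid_idx)
        print(f"fold {fold}: {len(valid_idx)} validation samples, {n_pos} positive")
```

Stratification matters here because melanoma cases are rare; a plain random split could leave a fold with almost no positives, which makes a per-fold AUC unstable or undefined.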
M5 Forecasting - Accuracy (May 2020-Jun 2020)
1. Used hierarchical sales data from Walmart, the world’s largest company by revenue, to forecast daily sales for the next 28 days. The data covers stores in three US states (California, Texas, and Wisconsin) and includes item-level, department, product category, and store details. In addition, it has explanatory variables such as price, promotions, day of the week, and special events.
2. Implemented several machine learning models, RNN (LSTM), XGBoost, and LightGBM, to determine which performed best.
3. These models achieved a loss (WRMSE) of 0.9 on the test dataset.