I am a recent graduate of George Mason University with a Master's in Data Analytics Engineering. I am looking for full-time opportunities in data science, analytics, deep learning, or software engineering. I have experience with ETL and data analysis using SAS Base, SQL, and Excel. I was a research assistant for a year, working on sensitivity analysis of neural networks. I am interested in solving problems using analytical and machine learning skills.
-
Experience
Data Analytics Intern at Baker Tilly
• Built an end-to-end system that performs ETL (extract, transform, and load) and data extraction using Microsoft Azure Data Factory and Azure Machine Learning Studio.
• Learned and implemented the end-to-end tax credit calculation process and data extraction using optical character recognition (OCR).
• Designed a user interface and a relational database, and used Power BI to generate reports.
• Created a rules engine to flag deviations in the data and consolidate tax data.
Software Engineer-Data at Accenture Solutions
• Member of the Business Intelligence team, working on an ETL (extract, transform, and load) tool using the SAS Base language and data warehousing in the banking domain.
• Analyzed data problems, such as missing data in some accounts and imbalanced reports, using SAS Base and SQL.
• Analyzed data reconciliation, data flows and pipelines, and financial data reporting.
• Worked in an Agile environment and communicated results and analyses to the business.
• Performed data processing, reporting, and analysis using MS Excel, and attended scrum calls.
-
Projects
High Utilization Prediction
• Performed exploratory data analysis of claims using Python and extracted features using a sliding-window approach.
• Predicted high utilizers using logistic regression, random forest, and LSTM.
• Compared models, tuned hyperparameters, and performed sensitivity analysis of the models.
Using Crowdsourcing to Improve the Forecasting Capability of a Prediction Market
• Designed an algorithm to improve the forecasts of the prediction market (known as SciCast) and built a Plotly dashboard of market performance.
• The algorithm involved identifying and extracting features from empirical data and using them for data modeling.
• Compared the mean absolute error (MAE) of different models; random forest regression achieved a significantly lower MAE than the baseline.
Time-Series Analysis and Forecasting
• Analyzed the stationarity of stock prices using statistical hypothesis tests.
• Converted non-stationary time series to stationary ones and used ARIMA to forecast the stock price.
• The stationary time series yielded the lowest mean squared error (MSE) and root mean squared error (RMSE).
MIMIC Data Analysis and Mortality Prediction
• Performed exploratory data analysis of MIMIC data using Python and SQL to predict patient mortality during a hospital stay.
• Used a sliding-window method to extract features for model training and testing.
• Compared logistic regression models with different window sizes to find the most accurate predictions.