Data Analyst, Costco, Seattle April 2020 - June 2020
Capstone project working on Costco data, supervised by Costco's director of analytics and business intelligence.
► Analyzed sales data in Python to address product overstocking and understocking by forecasting the stock levels needed for inventory management.
► Implemented hierarchical clustering with Fbprophet forecasting to predict product demand, achieving 96% accuracy for the average units predicted daily.
► Performed market basket analysis, profit-based customer segmentation, and exploratory data analysis to extract insights.
Data Analytics Automation Intern, SEED&SPARK, Seattle January 2020 - March 2020
The Experiential Network (XN) is an initiative built to help graduate students engage in experiential learning opportunities with Northeastern University's real-world partners.
Highlights: Automated the consolidation of multiple data files into a unified database; the tool accepts any number of input files and runs from a single button in a GUI.
► Performed data cleaning on approx. 500 Excel files, each containing information about one movie, extracted matching features from differently named columns, and combined them into a single Excel master database file using Python.
► Eliminated the manual merging of files by integrating all data cleaning and extraction functionality into a GUI built with the Python Tkinter package.
► The automation saved days of error-prone manual work.
Senior Data Analyst, THREEXFIVE CONSULTANCY, India Oct 2016 – Aug 2017
► Designed and implemented an unsupervised learning solution in Python to address the incorrect and inefficient placement of service delivery centers for a public sector organization in a developing nation.
► Citizens from nearby areas traveled to these centers to access various public services, incurring travel costs and time. Using fuzzy clustering and heuristic algorithms, identified more suitable locations for these centers than their existing ones.
► The solution reduced the cost and time of service access for citizens at these delivery centers by up to 20%.
Data Analyst, THREEXFIVE CONSULTANCY, India Apr 2015 – Sep 2016
► Managed large data sets and performed detailed analysis, including cleaning large amounts of survey data, to identify critical areas of interest using Python.
► Analyzed client data in Tableau and translated complex data and analysis into easily understood metrics.
Predicting House Price in Sammamish, WA (Real world dataset)
► Performed data preparation and feature engineering on raw data and selected the best features to build a Random Forest Regressor model that predicts house prices in Washington with an accuracy rate of 93%.
► The data set is real-world data for the Sammamish region and can be used by real estate agents to predict house prices.
HDMA Washington State Home Loan analysis
► Performed data visualization and exploratory data analysis using seaborn, matplotlib, and pandas in Python.
► Proposed suggestions for maximizing the home loan approval rate in Washington state, based on the extracted insights.
Sales Data visualization using Apache Hive and Apache Spark
► Analyzed and visualized a retail chain company's data and provided insights and suggestions, using distributed analytical processing technologies: Apache Spark for data cleaning and Apache Hive for extracting insights.
Analyzed Global Deforestation trend using SQL, Python and Tableau
► Analyzed global deforestation data by writing complex SQL queries using joins, grouping, aggregation, window functions, CASE statements, and nested subqueries, and interpreted the results obtained.