Yidan holds a Master's degree in Business Analytics from the Carlson School of Management and a Bachelor's degree in Supply Chain & Information Analytics from Purdue University. She is a data professional with 2 years of experience in model development and analytical consulting. Her strengths lie in collaborating with clients or business leaders to understand their business objectives and envisioning solutions to their problems. She is proficient in statistics, predictive and prescriptive modeling, probability distributions, visualization, and data migration and warehousing. Having worked in analytics since her undergraduate years, she has a strong track record of solving complex problems and driving data-driven decisions through industry collaborations. With broad exposure to analytics from both industry and academic experience, she is passionate about the field and able to contribute across different business functions.
Experienced in Python, SQL, R, Tableau, Power BI and big data environments of AWS, Hive, Hadoop and Spark.
Seeking full-time positions starting June 2020. Please reach out via LinkedIn or at gaoxx844@umn.edu
-
Experience
Designed NoSQL Database for Non-profit Governmental Agency Jan 2020 – May 2020
• Conducted disparity analysis in Power BI to identify underserved groups and documented the work in Confluence
• Used linear regression with scikit-learn in Python to predict wellbeing among demographic groups, optimized the model to allocate limited food-shelf resources, and tracked progress as scrum master using Jira
Drove Up Headcounts for Leading Hospitality and Entertainment Company Oct 2019 – Dec 2019
• Clustered 2M customers into 4 segments with K-Means in scikit-learn, identified underserved groups by setting KPIs, and visualized findings in Power BI
• Designed promotion bundles with association rule mining in R to drive up headcount by 26%
Optimized Staff Planning for client Mall of America Jun 2019 – Sep 2019
• Differentiated important holidays with uplift regression and ggplot in R to support workforce management
• Managed 40K call center data points to optimize the staffing roster. Identified trends by time, location, type, and weather; visualized the data in Power BI; wrote a technical report in R Markdown; and presented it to leadership
CHINA TELECOM
Marketing Analyst Jun 2018 – Sep 2018
• Led a 5-member cross-functional team to retrieve 160K user records using SQL to detect transaction patterns. Defined key metrics to analyze user retention rate and optimize the existing profit model
• Derived actionable insights on customer affinity toward the E-pay online payment system using tree-based models in Python with ~87% accuracy, resulting in 0.4% YoY growth in the province
Information Analyst Jun 2017 – Sep 2017
• Extracted 100K user web search records using SQL on Hive to present insights on data flow trends. Calculated metrics such as page views, duration, and conversion rate for exploratory analysis
• Deployed an ETL pipeline and built logistic regression and decision tree models with 75% accuracy using scikit-learn in Python to understand user demographic profiles, and helped implement a promotional plan
-
Projects
Music Recommendation with Spark MLlib
• Retrieved and aggregated listening records using Spark SQL. Visualized user listening patterns by converting the DataFrame to an RDD and plotting with Matplotlib in PySpark
• Improved big data computation efficiency by 30% with the Hadoop file system and MapReduce framework
Cloud Computing with AWS EMR
• Computed word frequencies of Twitter text on an EMR cluster by retrieving data with SQL queries in Hue. Identified hourly traffic, the top 10 keywords, and the top 3 IPs by number of visits, running compute instances on EC2 and storing results in S3
Snowflake Hackathon Competition (Top 4)
• Defined the schema and developed granular data using SQL queries to merge external data sources of LA restaurant inspection and violation records into a Snowflake data warehouse
• Built an executive report in Tableau by providing context, accounting for nuances among target users, and linking visuals to decisions