My name is Tianle Zhu, and I will complete my MA in Statistics at Columbia University at the end of this year. I am looking for an entry-level Data Scientist or Data Analyst position, ideally in the tech field. I have a strong background in mathematics and statistics and am proficient in Python and R. I have experience in statistical modeling, predictive modeling, data cleaning and visualization, feature engineering, and machine learning.
-
Experience
Statistical Consultant, UCLA Health 06/2018 – 06/2019
• Performed statistical analyses for the campus health clinic.
• Merged datasets using R and used Tableau to produce graphical analyses.
• Developed conclusions using statistical methodologies including descriptive analysis, hypothesis testing, logistic regression, longitudinal modeling, and Cox proportional hazards modeling.
• Assisted in publishing a paper in the Journal of Neuroimmune Pharmacology.
Data Analyst Volunteer, UCLA Biostatistics 10/2018 – 06/2019
• Coordinated the data management aspects of the health research laboratory.
• Organized and cataloged data using the database software Access.
• Oversaw database system management, including maintenance, quality assurance, and reporting and analysis functionality.
• Proactively detected and corrected errors to ensure accuracy in all datasets.
-
Projects
Deep Video Understanding, ACM Multimedia 2020 Grand Challenge 05/2020 – 07/2020
• Used Google Cloud's Speech-to-Text API to transcribe all spoken utterances.
• Detected video shot and scene boundaries using color histograms.
• Extracted a 128-dimensional audio embedding for each second of audio for each speaker as a semantically compact representation, using the VGGish model.
• Labeled the relationships between speakers in the video and implemented a supervised dual-network model trained on the audio embeddings.
Medical Image Diagnosis, Columbia University 02/2020 – 05/2020
• Designed a two-step image diagnosis pipeline consisting of classification and localization, and used transfer learning to fine-tune the models.
• Implemented VGG16 and ResNet50 classification models, achieving 94% accuracy.
• Deployed YOLO for object detection and U-Net for segmentation as localization models, reaching an AUC of 0.94.
Generative Adversarial Network (GAN), Columbia University 11/2019 – 12/2019
• Analyzed how hyperparameters, including the number of epochs, the optimizer, the number of layers and neurons, the form of the loss function, and regularization, affect the performance and convergence of the GAN models.
• Compared GAN and WGAN by training and validating them on the MNIST and SVHN datasets, respectively.
• Migrated TensorFlow 1.0 code to 2.0 to implement and customize GAN models.
• Visualized model results to check whether the GAN models were generating realistic images.
Wine Quality Prediction, UCLA 01/2018 – 03/2018
• Constructed a multiple linear regression model to predict wine quality, using 70% of the data set for training.
• Implemented machine learning approaches including regression, classification, and clustering.
• Improved on the original full model, reducing the prediction error by 62.8% using regression analysis.