Hi, I'm a Data Scientist at The Climate Corporation, working with geospatial data. I develop machine learning models and algorithms that translate insights into business opportunities.
• Analyzed geospatial agricultural data (planter pass data, harvester pass data, and planter and harvester equipment data) to study grower practices, using Python and data visualization tools.
• Worked on the complete life cycle of the Outcome-Based Pricing (OBP) model: defined protocol rules for 2020, developed algorithms, planned the strategy to productionize them, approved the Engineering model for deployment, and made model improvements for 2021+.
• Collaborated with Data, Product, and Engineering teams on the above.
• Performed feature engineering, model selection, and hyperparameter optimization to predict crop yield using ML models such as Nearest Neighbor Gaussian Process (NNGP) and linear models.
• Developed supervised learning models to predict the best experimental areas.
• Collected data from sources such as AWS S3, the Hadoop Distributed File System (HDFS), and internal production APIs to build datasets for developing, testing, and validating algorithms.
• Created large samples of fields with Spark SQL, using stratified and random sampling, to scale OBP algorithms and pilot runs.
• Created and optimized PySpark ETL pipelines on AWS to scale the OBP algorithm to a large number of North American and South American fields for testing and validation.
• Developed and scaled algorithms to improve the performance of OBP models for 2021+.
• Wrote and reviewed code following best coding practices. Worked with Git version control. Wrote reports in Confluence. Managed tasks in Jira.
• Coached and trained a Jr. Data Scientist.
• Analyzed stock price data using technical analysis. Developed a derivatives trading algorithm in Python, with data collected through the Robinhood API. The algorithm outperforms the S&P 500 by about 5%.