Introduction
The sports industry is highly competitive, yet the threat of new entrants is low: entry requires a significant amount of capital, human as well as financial. Additionally, the time, effort, and resources required to gain customers' trust and establish a brand make it difficult for new entrants to differentiate themselves from established players.
Under Armour (UAA) has done a phenomenal job of establishing itself in the market against Nike and Adidas. In my research report, I anticipated that core profit margin (core PM) would increase, as UAA has been advised to reduce inventory through off-campus channels and to invest in superior product innovation through Connected Fitness. UAA is expected to increase inventory efficiency and invest in superior technology and product innovation, and hence to generate more dollars in sales for every dollar in assets, that is, a higher asset turnover (ATO).
In this data analysis report, I evaluate the likelihood of UAA succeeding in generating more dollars in sales for every dollar in assets. I do so by building models on Compustat data for S&P 1500 companies. I use heavy R&D investment, possible product expansion, new technology, and past high sales growth as predictor variables to predict ATO.
Models
1. The first model uses R&D investment, past sales, ATO, investment in assets, and market concentration as predictor variables to forecast ATO in year t+1:
$\text{ATO}_{t+1} = \alpha + \beta_1\,\text{per\_xrd} + \beta_2\,\text{incr\_xrd} + \beta_3\,\text{avg\_xrd} + \beta_4\,\text{salesgr} + \beta_5\,\text{salesgr\_count} + \beta_6\,\text{ato\_gr} + \beta_7\,\text{avg\_at} + \beta_8\,\text{per\_at} + \beta_9\,\text{hhi}$
2. The second model forecasts with past sales, current ATO, and three variables focused on R&D:
$\text{ATO}_{t+1} = \alpha + \beta_1\,\text{incr\_xrd} + \beta_2\,\text{salesgr} + \beta_3\,\text{ato} + \beta_4\,\text{per\_xrd} + \beta_5\,\text{avg\_xrd}$
3. The third model applies an ensemble method, boosting, to the same predictors as the first model:
$\text{ATO}_{t+1} = \alpha + \beta_1\,\text{per\_xrd} + \beta_2\,\text{incr\_xrd} + \beta_3\,\text{avg\_xrd} + \beta_4\,\text{salesgr} + \beta_5\,\text{salesgr\_count} + \beta_6\,\text{ato\_gr} + \beta_7\,\text{avg\_at} + \beta_8\,\text{per\_at} + \beta_9\,\text{hhi}$
4. The fourth model incorporates interactions among the predictors to forecast ATO in year t+1:
$\text{ATO}_{t+1} = \alpha + \beta_1\,\text{incr\_xrd} + \gamma\,(\text{avg\_xrd} \times \text{per\_xrd}) + \beta_2\,\text{salesgr\_count} + \delta\,(\text{avg\_at} \times \text{per\_at}) + \beta_3\,\text{ato} + \beta_4\,\text{atopr}$
5. The fifth model runs KNN regression and filters by industry code to predict ATO for year t+1:
pnfit <- lm(ato_ld1 ~ poly(per_xrd, 3, raw = TRUE) + poly(incr_xrd, 3, raw = TRUE) + poly(avg_xrd, 3, raw = TRUE) + poly(salesgr, 3, raw = TRUE) + poly(ato_gr, 3, raw = TRUE), data = temp)
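The fifth model's snippet builds each predictor with poly(x, 3, raw = TRUE). As a small illustration of what that call produces, the sketch below uses a made-up vector x (not taken from the report's data) and checks that the three columns returned are simply x, x^2, and x^3, rather than orthogonal polynomials.

```r
# Toy predictor; the values are invented for illustration only.
x <- c(1, 2, 3, 4)

# raw = TRUE returns plain powers of x instead of orthogonal polynomials.
p <- poly(x, 3, raw = TRUE)

# Build the same three columns by hand: x, x^2, x^3.
raw_powers <- cbind(x, x^2, x^3)

# Compare the numeric contents, ignoring column names and class attributes.
same_columns <- isTRUE(all.equal(unname(unclass(p)[, 1:3]), unname(raw_powers)))
print(same_columns)
```

With raw powers, each fitted coefficient multiplies x, x^2, or x^3 directly, which makes a saturation effect (a negative coefficient on a higher power) easier to read off the summary.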
Data
The data is from Compustat and covers 1991 through 2017 for all S&P 1500 companies. To limit the influence of outliers, I winsorized the variables and fed the winsorized data to the prediction models: any value below the 2nd percentile is set to the 2nd percentile, and any value above the 98th percentile is set to the 98th percentile. Table 1 in the Appendix defines the individual variables used.
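The winsorization rule described above can be sketched in a few lines of base R. This is a minimal illustration, not the code used for the report: the function name winsorize and the simulated salesgr_raw vector are assumptions introduced here.

```r
# Hypothetical helper: clamp values to the [lower, upper] sample quantiles.
winsorize <- function(x, lower = 0.02, upper = 0.98) {
  bounds <- quantile(x, probs = c(lower, upper), na.rm = TRUE, names = FALSE)
  # pmax raises small values to the lower bound, pmin caps large values.
  pmin(pmax(x, bounds[1]), bounds[2])
}

# Simulated sales growth with two extreme observations (illustrative data only).
set.seed(1)
salesgr_raw <- c(rnorm(98, mean = 5, sd = 10), -500, 900)
salesgr_w <- winsorize(salesgr_raw)

print(range(salesgr_raw))  # includes the two extreme values
print(range(salesgr_w))    # extremes pulled in to the 2nd/98th percentiles
```

Unlike trimming, this keeps every observation in the sample and only changes the values in the tails.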
I have chosen these variables because I believe they sufficiently capture the features I expect to predict ATO. Under Armour wants to differentiate itself through superior product innovation in Connected Fitness. For it to find unique and innovative ways of differentiating itself from competitors, I anticipate continued high investment in R&D. To evaluate this hypothesis, I measure R&D with three metrics: (i) R&D investment of the company as a percentage of industry R&D; (ii) growth in R&D investment over the past 5 years; (iii) average R&D over the past 5 years.
Under Armour has successfully established itself and earned a substantial market share in an industry with high competitive rivalry, where Nike and Adidas are the established leaders. Hence, I have used past sales growth and the Herfindahl index as predictor variables: (i) sales growth over the past year; (ii) a count of year-over-year sales increases over the past 5 years; (iii) the Herfindahl index, HHI.
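For readers unfamiliar with the Herfindahl index, it is the sum of squared market shares within an industry. The sketch below computes it in base R on a made-up five-firm data frame; the data frame firms and its values are assumptions for illustration, and with the actual panel the grouping would be by industry and year rather than industry alone.

```r
# Illustrative data: two industries with invented sales figures.
firms <- data.frame(
  industry = c("A", "A", "A", "B", "B"),
  sale     = c(50, 30, 20, 90, 10)
)

# Market share within each industry: sale / sum(sale), as in Table 1 (mktsh).
firms$mktsh <- firms$sale / ave(firms$sale, firms$industry, FUN = sum)

# HHI per industry: sum of squared market shares.
hhi <- tapply(firms$mktsh^2, firms$industry, sum)
print(hhi)
```

In this toy example industry A, with shares of 0.5, 0.3, and 0.2, has an HHI of 0.38, while industry B, dominated by one firm, has an HHI of 0.82; a higher value indicates a more concentrated market.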
The sportswear industry is capital intensive, and product expansion can require further investment in assets. Hence, I have used investment in assets as a predictor in my model: (i) average investment in assets; (ii) the company's investment in capital as a percentage of industry investment in capital.
Further, I want to use past ATO as a predictor of future ATO. For that, I defined predictor variables to capture: (i) ATO for the current year, t; (ii) ATO for the past year, t-1; (iii) the magnitude of the change in ATO over the past year; (iv) a count variable capturing ATO growth over the past 5 years.
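The lagged, lead, and change variables for ATO can be built per firm so that values never spill from one company into the next. The base R sketch below is one way to do this under stated assumptions: the toy data frame panel, the helper names lag1 and lead1, consecutive years within each firm, and ato_gr taken as a simple difference are all illustrative choices, not the report's actual construction.

```r
# Illustrative firm-year panel with invented numbers.
panel <- data.frame(
  firm = c("X", "X", "X", "Y", "Y", "Y"),
  year = c(2015, 2016, 2017, 2015, 2016, 2017),
  sale = c(100, 120, 150, 400, 380, 420),
  at   = c(100, 100, 120, 200, 200, 200)
)

# Sort so that shifting by one row means shifting by one year within a firm.
panel <- panel[order(panel$firm, panel$year), ]
panel$ato <- panel$sale / panel$at  # asset turnover: sale / at

# Hypothetical helpers: shift a vector by one position, padding with NA.
lag1  <- function(v) c(NA, head(v, -1))
lead1 <- function(v) c(tail(v, -1), NA)

# ave() applies the shift separately within each firm.
panel$atopr   <- ave(panel$ato, panel$firm, FUN = lag1)   # ATO in t-1
panel$ato_ld1 <- ave(panel$ato, panel$firm, FUN = lead1)  # ATO in t+1
panel$ato_gr  <- panel$ato - panel$atopr                  # assumed: simple difference

print(panel)
```

A quick sanity check on a variable built this way is that its summary statistics sit on the same scale as ato; the last year of each firm has no next-year value and is left as NA.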
Results
Of the various models I tried and tested, I highlight the results that yielded the most insight and that, in my opinion, worked best for my analysis.
Model 1: Results from linear model to predict ATO for year t+1
Table 3 presents results from the linear model to predict ATO for year t+1. The data for this model is the winsorized set of predictor variables capturing R&D investment, past sales, ATO, investment in assets, and market concentration. The statistically significant variables are: per_xrd, the percentage of the company’s investment in R&D over total R&D investment within the industry; incr_xrd, a count variable for year-over-year increases in R&D over the past 5 years; avg_xrd, average investment in R&D; avg_at, average investment in total assets; per_at, the company's investment in capital as a percentage of industry investment in capital; and hhi, the Herfindahl index.
The results show a positive correlation between per_xrd and ato_{t+1}, a negative correlation between incr_xrd and ato_{t+1}, and a positive correlation between avg_xrd and ato_{t+1}. I interpret this as R&D, in general, having a strong influence on asset turnover, with high and consistent investment in R&D eventually enabling product differentiation. I interpret the negative correlation between incr_xrd and ato_{t+1} as indicating that an investment in R&D may not immediately translate into higher sales or better use of existing assets.
The results also show that avg_at and ato_{t+1} are positively correlated, and that per_at and ato_{t+1} are strongly positively related. In my judgment, these results from the general data set fit well with expectations for the sportswear industry, which is capital intensive. Consistent and significant investment in assets could result in higher sales and would provide a platform for the company to differentiate itself, leverage economies of scale, and establish itself as a brand. I interpret the negative correlation between hhi and ato_{t+1} as follows: despite heavy investment in R&D and efforts towards product differentiation, if the market is extremely concentrated and competitive, it is difficult to increase sales, which produces an inverse relationship with future ATO.
Model 2: Results from forecasting with past sales, ATO, and three variables focused on R&D
Table 4 presents results from forecasting with past sales, ATO, and three variables focused on R&D. The results reinforce the interpretation from Model 1 of the influence of R&D, with incr_xrd, per_xrd, and avg_xrd as statistically significant variables, and show an inverse relation between present ATO and ato_{t+1}. On initial analysis, this inverse relation runs counter to my expectations.
Model 3: Ensemble methods - Boosting
Table 5 presents the results from fitting a boosted tree on the majority of the predictor variables to project ATO for year t+1. The results show the relative influence of each predictor variable on the outcome variable. The company's share of industry investment in assets (per_at) plays the most important role in driving asset turnover, with a relative influence of 50.26%. This is followed by average investment in R&D with an influence of 25%, average investment in assets with 10.27%, and industry concentration with 9.71%.
On deeper analysis, because I took the entire data set into consideration, the general picture may not reflect my specific industry. However, the insights provide a reference point for further analysis and an overall picture to start from as I narrow down towards industry-specific results.
Table 6 provides a graph to help visualize the relative influence of each variable.
Model 4
Table 7 presents results from a model with interactions between variables to evaluate ato_{t+1}. The model reinforces the influence of R&D and investment in assets on ato_{t+1} and shows the interactions per_xrd:avg_xrd and avg_at:per_at to be statistically significant, with a direct impact on ato_{t+1}.
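When reading the Table 7 output, it helps to know that in an R formula the `*` operator expands to both main effects plus their interaction, so per_xrd*avg_xrd contributes three terms. The short sketch below inspects the formula from the Table 7 snippet with terms(); it needs no data, and the object names f and term_labels are introduced here for illustration.

```r
# Formula as written in the Table 7 snippet.
f <- ato_ld1 ~ incr_xrd + per_xrd * avg_xrd + salesgr_count +
  avg_at * per_at + ato + atopr

# terms() expands the formula; "term.labels" lists every right-hand-side term.
term_labels <- attr(terms(f), "term.labels")
print(term_labels)

# The interaction terms are the labels containing a colon.
print(grep(":", term_labels, value = TRUE))
```

The coefficient reported for per_xrd:avg_xrd is therefore the effect of the product over and above the two separate main effects, which also appear in the fitted model.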
Model 5
Table 8 presents results from running KNN regression first on the entire data set and then on the subset filtered by industry code. The results are shown in Panel A and Panel B respectively.
The results in Panel A reinforce per_xrd, incr_xrd, and ato_gr as statistically significant variables. The per_xrd results indicate that while initial investment in R&D leads to substantial growth in ATO, there is eventually a saturation point beyond which more investment in R&D does not significantly drive ATO. The results in Panel B were not meaningful initially because of an error in the calculation of the ato_ld1 variable. After fixing it and re-running the model with the corrected variable, the model produced positive results (Panel C).
Conclusion
Several useful insights emerge from the models. Across multiple models, the significance of R&D and total assets is emphasized repeatedly.
Of the models studied, the basic linear model performs best and provides the most insight into the current data set. My interpretation is that R&D, in general, has a strong influence on asset turnover and would eventually enable product innovation and differentiation. Because the sportswear industry is capital intensive, consistent and significant investment in assets would provide a platform for the company to differentiate itself, leverage economies of scale, and establish itself as a brand. The negative correlation between hhi and ato_{t+1} highlights the challenges of growing in an extremely concentrated and competitive market.
APPENDIX
Table 1
Variable Definitions
per_xrd | Percentage of the company’s investment in R&D over total investment in R&D within the industry |
incr_xrd | Count to evaluate the increase in R&D YoY by evaluating data over the past 5 years |
avg_xrd | Average investment in R&D over the years |
salesgr | Percent change in total sales t-1 to t |
salesgr_count | Count to evaluate an increase in Sales YoY by evaluating data over the past 5 years to measure past expectations |
ato | asset turnover ratio of year t: sale/at |
atopr | asset turnover ratio of year t-1 |
ato_gr | Change in ATO t-1 to t |
atogr_count | Count variable to store YoY increase in ATO over the past 5 years |
ato_ld1 | Variable to store next year's ATO (year t+1) |
avg_at | Average investment in at = mean(at) |
per_at | Percentage of investment in capital over industry investment in capital |
mktsh | Market share of the company by revenue = sale/sum(sale) |
hhi | Herfindahl index to evaluate market concentration to further determine market competitiveness |
Table 2
Summary Statistics
Variable | Mean | Min | Median | Max |
per_xrd | 0.42405 | 0.0000 | 0.01579 | 8.00790 |
incr_xrd | 1.78300 | 0.0000 | 2.00000 | 5.00000 |
avg_xrd | 186.8413 | 0.0222 | 234.7899 | 377.6486 |
salesgr | 5.317 | -33.900 | 2.829 | 103.914 |
salesgr_count | 3.063 | 0.0000 | 3.0000 | 5.00000 |
ato | 1.057 | 0.00051 | 0.87890 | 9.3644 |
atopr | 1.08033 | 0.00511 | 0.9005 | 13.177 |
ato_gr | -0.00667 | -0.42614 | 0.0000 | 0.63396 |
atogr_count | 3.007 | 0.000 | 3.000 | 5.000 |
ato_ld1 | 7430.22 | 36.69 | 2029.65 | 100184.0 |
avg_at | 9657 | 1832 | 7286 | 86646 |
per_at | 0.636704 | 0.001851 | 0.12615 | 7.94685 |
mktsh | 0.008515 | 0.0000 | 0.00133 | 0.87614 |
ato_ld1* | 1.0404 | 0.0511 | 0.8699 | 3.1670 |
Table 3
Results from linear model to predict ATO for year t+1
Model:
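$\text{ATO}_{t+1} = \alpha + \beta_1\,\text{per\_xrd} + \beta_2\,\text{incr\_xrd} + \beta_3\,\text{avg\_xrd} + \beta_4\,\text{salesgr} + \beta_5\,\text{salesgr\_count} + \beta_6\,\text{ato\_gr} + \beta_7\,\text{avg\_at} + \beta_8\,\text{per\_at} + \beta_9\,\text{hhi}$
R code snippet:
# Fit the linear model on the winsorized data, then print the coefficient table
olsfit <- lm(ato_ld1 ~ per_xrd + incr_xrd + avg_xrd + salesgr + salesgr_count + ato_gr + avg_at + per_at + hhi, data = temp)
summary(olsfit)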
Results
Table 4
Results from forecasting with past sales, ATO, and three variables focused on R&D
Model:
$\text{ATO}_{t+1} = \alpha + \beta_1\,\text{incr\_xrd} + \beta_2\,\text{salesgr} + \beta_3\,\text{ato} + \beta_4\,\text{per\_xrd} + \beta_5\,\text{avg\_xrd}$
R code snippet:
# Fit the reduced linear model, then print the coefficient table
est <- lm(ato_ld1 ~ incr_xrd + salesgr + ato + per_xrd + avg_xrd, data = temp)
summary(est)
Results:
Table 5
Ensemble methods: Boosting
Model:
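$\text{ATO}_{t+1} = \alpha + \beta_1\,\text{per\_xrd} + \beta_2\,\text{incr\_xrd} + \beta_3\,\text{avg\_xrd} + \beta_4\,\text{salesgr} + \beta_5\,\text{salesgr\_count} + \beta_6\,\text{ato\_gr} + \beta_7\,\text{avg\_at} + \beta_8\,\text{per\_at} + \beta_9\,\text{hhi}$
R code snippet:
library(gbm)  # provides gbm()
# Boosted regression trees with squared-error loss on the same predictors as Table 3
boostedtreefit <- gbm(ato_ld1 ~ per_xrd + incr_xrd + avg_xrd + salesgr + salesgr_count + ato_gr + avg_at + per_at + hhi, data = temp, distribution = "gaussian", n.trees = 10000, shrinkage = 0.001, interaction.depth = 4)
# summary() on a gbm fit reports the relative influence of each predictor
summary(boostedtreefit)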
Results
Table 6
Graph showing relative influence of each of the variables on prediction
Note: The graph is for illustration purposes. Some variables are not being reflected appropriately on y axis, please refer back to Table 5 for complete information
Table 7
Model leveraging interactions among the variables to evaluate the impact
Model:
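$\text{ATO}_{t+1} = \alpha + \beta_1\,\text{incr\_xrd} + \gamma\,(\text{avg\_xrd} \times \text{per\_xrd}) + \beta_2\,\text{salesgr\_count} + \delta\,(\text{avg\_at} \times \text{per\_at}) + \beta_3\,\text{ato} + \beta_4\,\text{atopr}$
R code snippet:
# a*b in the formula adds both main effects and the a:b interaction
pnintfit <- lm(ato_ld1 ~ incr_xrd + per_xrd*avg_xrd + salesgr_count + avg_at*per_at + ato + atopr, data = temp)
summary(pnintfit)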
Results
Table 8: KNN regression and filtering by industry code
Model:
pnfit <- lm(ato_ld1 ~ poly(per_xrd, 3, raw = TRUE) + poly(incr_xrd, 3, raw = TRUE) + poly(avg_xrd, 3, raw = TRUE) + poly(salesgr, 3, raw = TRUE) + poly(ato_gr, 3, raw = TRUE), data = temp)
R code snippet:
library(dplyr)  # provides %>% and filter()
# Fit the cubic polynomial model on the full data set
pnfit <- lm(ato_ld1 ~ poly(per_xrd, 3, raw = TRUE) + poly(incr_xrd, 3, raw = TRUE) + poly(avg_xrd, 3, raw = TRUE) + poly(salesgr, 3, raw = TRUE) + poly(ato_gr, 3, raw = TRUE), data = temp)
summary(pnfit)
# Keep only the firms in the two NAICS industry codes of interest
chcktest <- temp %>%
  filter(naics == '315220' | naics == '315990')
# Regress realized next-year ATO on the model's predictions for the filtered firms
summary(lm(ato_ld1 ~ predict(pnfit, chcktest), data = chcktest))
Results:
Panel A: KNN regression without filtering by industry code
Panel B: KNN regression on filtering by industry code
Panel C: KNN regression on filtering by industry code after correction to ato_ld1