Data-driven financial analysis


This study examines whether Under Armour (UAA) can increase its asset turnover ratio (ATO). The specific question is whether UAA's efforts toward product differentiation through Connected Fitness will succeed in increasing its ATO. The goal is to evaluate the key performance indicators (KPIs) that lead to an increase in ATO. The data analysis problem is whether companies with high R&D, all-season sales, and high past growth have been successful in increasing their ATO.

Introduction

The sports industry is highly competitive, with a low threat of new entrants: entering the market requires a significant amount of capital, human as well as financial. Additionally, the time, effort, and resources required to gain customers' trust and establish a brand make it difficult for new entrants to differentiate themselves from established players.

Under Armour has done a phenomenal job of establishing itself in the market against Nike and Adidas. Based on my research report, I anticipate that the core profit margin (core-PM) will increase, as UAA has been advised to reduce inventory through off-price channels and to invest in superior product innovation through Connected Fitness. UAA is expected to increase inventory efficiency and invest in technology-driven product innovation, and hence to generate more dollars in sales for every dollar in assets.

In this data analysis report, I evaluate the likelihood that UAA will generate more dollars in sales for every dollar in assets. I do so by building models on Compustat data for S&P 1500 companies. My predictor variables for ATO capture heavy R&D, possible product expansion, new technology, and high past sales growth.

Models

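1. The first model uses R&D investment, past sales, ATO, investment in assets, and market concentration as predictor variables to forecast ATO in year t+1:

$$ATO_{t+1} = \alpha + \beta_1\,\mathrm{per\_xrd} + \beta_2\,\mathrm{incr\_xrd} + \beta_3\,\mathrm{avg\_xrd} + \beta_4\,\mathrm{salesgr} + \beta_5\,\mathrm{salesgr\_count} + \beta_6\,\mathrm{ato\_gr} + \beta_7\,\mathrm{avg\_at} + \beta_8\,\mathrm{per\_at} + \beta_9\,\mathrm{hhi}$$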

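2. The second model forecasts ATO from past sales, current ATO, and three variables focused on R&D:

$$ATO_{t+1} = \alpha + \beta_1\,\mathrm{incr\_xrd} + \beta_2\,\mathrm{salesgr} + \beta_3\,\mathrm{ato} + \beta_4\,\mathrm{per\_xrd} + \beta_5\,\mathrm{avg\_xrd}$$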

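3. The third model uses an ensemble method, boosting, to predict ATO from the following predictors:

$$ATO_{t+1} = \alpha + \beta_1\,\mathrm{per\_xrd} + \beta_2\,\mathrm{incr\_xrd} + \beta_3\,\mathrm{avg\_xrd} + \beta_4\,\mathrm{salesgr} + \beta_5\,\mathrm{salesgr\_count} + \beta_6\,\mathrm{ato\_gr} + \beta_7\,\mathrm{avg\_at} + \beta_8\,\mathrm{per\_at} + \beta_9\,\mathrm{hhi}$$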

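4. The fourth model incorporates polynomials and interactions among the variables to predict ATO in year t+1:

$$ATO_{t+1} = \alpha + \beta_1\,\mathrm{incr\_xrd} + \gamma\,(\mathrm{avg\_xrd} \times \mathrm{per\_xrd}) + \beta_2\,\mathrm{salesgr\_count} + \delta\,(\mathrm{avg\_at} \times \mathrm{per\_at}) + \beta_3\,\mathrm{ato} + \beta_4\,\mathrm{atopr}$$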

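5. The fifth model, labelled KNN regression in the appendix, is fitted in the code as a cubic polynomial regression. It is run on the full data set and then on a subset filtered by industry code to predict ATO for year t+1:

pnfit <- lm(ato_ld1 ~ poly(per_xrd, 3, raw = TRUE) + poly(incr_xrd, 3, raw = TRUE) + poly(avg_xrd, 3, raw = TRUE) + poly(salesgr, 3, raw = TRUE) + poly(ato_gr, 3, raw = TRUE), data = temp)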

Data

The data come from Compustat and cover all S&P 1500 companies from 1991 through 2017. To limit the influence of outliers, I winsorized the variables and fed the winsorized data to the prediction models: any value below the 2nd percentile is set to the 2nd percentile, and any value above the 98th percentile is set to the 98th percentile. Table 1 in the Appendix defines the individual variables used.
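
The winsorization rule described above can be sketched in base R as follows. This is a minimal illustration, not the study's actual code: the helper name `winsorize` and the toy vector are assumptions, and `quantile()` is used with its default interpolation type, which may differ from whatever the original analysis used.

```r
# Winsorize a numeric vector at the given lower and upper percentiles.
# `winsorize` is a hypothetical helper, not a function from the study.
winsorize <- function(x, lower = 0.02, upper = 0.98) {
  bounds <- quantile(x, probs = c(lower, upper), na.rm = TRUE, names = FALSE)
  # Raise values below the lower bound, then cap values above the upper bound.
  pmin(pmax(x, bounds[1]), bounds[2])
}

# Toy data: 99 ordinary values plus one extreme outlier.
x <- c(1:99, 10000)
w <- winsorize(x)

range(x)  # original range, including the outlier
range(w)  # range after clamping to the 2nd and 98th percentiles
```

Values inside the two bounds pass through unchanged, so only the tails of each variable are affected.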

I chose these variables because I believe they capture the features I expect to predict ATO. Under Armour wants to differentiate itself through superior product innovation in Connected Fitness. Finding unique and innovative ways to stand out from competitors should require continued high investment in R&D. To test this hypothesis, I evaluate R&D on three metrics: (i) the company's R&D investment as a percentage of industry R&D, (ii) growth in R&D investment over the past 5 years, and (iii) average R&D over the past 5 years.
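
As a rough sketch of how the three R&D metrics could be built, the base R snippet below works on a made-up two-firm panel. The data frame `panel`, its values, and the helper `count_increases` are assumptions for illustration; it also uses a single fixed window of years rather than the rolling 5-year window a full panel would need.

```r
# Toy firm-year panel; xrd stands for R&D expense and all values are made up.
panel <- data.frame(
  firm  = rep(c("A", "B"), each = 6),
  naics = "315220",
  year  = rep(2012:2017, times = 2),
  xrd   = c(10, 12, 11, 15, 18, 20,   5, 5, 6, 4, 7, 9)
)

# (i) Firm R&D as a percentage of total industry R&D in the same year.
industry_xrd <- ave(panel$xrd, panel$naics, panel$year, FUN = sum)
panel$per_xrd <- 100 * panel$xrd / industry_xrd

# (ii) Count of year-over-year R&D increases and (iii) mean R&D, per firm.
# Six years of data give five year-over-year changes per firm.
count_increases <- function(x) sum(diff(x) > 0)
incr_xrd <- tapply(panel$xrd, panel$firm, count_increases)
avg_xrd  <- tapply(panel$xrd, panel$firm, mean)

incr_xrd
avg_xrd
```

The rows must be sorted by year within each firm for `diff()` to measure year-over-year changes.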

Under Armour has established itself and earned a substantial market share in an industry with intense competitive rivalry, where Nike and Adidas are the established leaders. Hence, I use past sales growth and the Herfindahl index as predictor variables: (i) sales growth over the past year, (ii) a count of year-over-year sales increases over the past 5 years, and (iii) the Herfindahl index (HHI).
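
The Herfindahl index is the sum of squared market shares within an industry. The sketch below computes it in base R from the market share definition given in Table 1 (sale/sum(sale)). The data frame `sales` and its figures are made up, and it is an assumption that shares are expressed as fractions rather than percentages.

```r
# Toy single-year cross-section; sales figures are made up for illustration.
sales <- data.frame(
  firm  = c("A", "B", "C", "D"),
  naics = c("315220", "315220", "315220", "315990"),
  sale  = c(50, 30, 20, 100)
)

# Market share by revenue within each industry: sale / sum(sale).
sales$mktsh <- sales$sale / ave(sales$sale, sales$naics, FUN = sum)

# Herfindahl index per industry: the sum of squared market shares.
hhi <- tapply(sales$mktsh^2, sales$naics, sum)
hhi
```

With fractional shares the index runs from near 0 (many small firms) to 1 (a single firm holds the whole industry), which is why a higher value indicates a more concentrated market.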

The sportswear industry is capital intensive, and product expansion can require further investment in assets. Hence, I use investment in assets as a predictor: (i) average investment in assets and (ii) the company's investment in capital as a percentage of industry investment in capital.

Further, I use past ATO as a predictor of future ATO. For that, I defined predictor variables to capture (i) ATO for the current year, t, (ii) ATO for the past year, t-1, (iii) the magnitude of the change in ATO over the past year, and (iv) a count variable capturing ATO growth over the past 5 years.
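
The ATO variables listed above can be illustrated for a single firm with base R. The data frame `firm` and its numbers are made up, and treating ato_gr as a plain difference (rather than a percentage change) is an assumption, since Table 1 only describes it as the change in ATO from t-1 to t.

```r
# Toy panel for one firm; sale and at (total assets) values are made up.
firm <- data.frame(
  year = 2012:2017,
  sale = c(100, 120, 130, 125, 150, 170),
  at   = c(100, 100, 110, 120, 125, 130)
)

firm$ato     <- firm$sale / firm$at        # ATO in year t
firm$atopr   <- c(NA, head(firm$ato, -1))  # ATO in year t-1 (lag)
firm$ato_gr  <- firm$ato - firm$atopr      # change from t-1 to t
firm$ato_ld1 <- c(tail(firm$ato, -1), NA)  # ATO in year t+1 (the target)

# Count of year-over-year ATO increases across the available years.
atogr_count <- sum(firm$ato_gr > 0, na.rm = TRUE)
firm
```

In a multi-firm panel, the lag and lead have to be taken within each firm so that one company's values do not spill into the next company's rows.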

Results

Of the various models I tried and tested, I highlight the results that gave me the most insight and, in my opinion, worked best for my analysis.

Model 1: Results from linear model to predict ATO for year t+1

Table 3 presents results from the linear model predicting ATO for year t+1. The data for this model are the winsorized predictor variables capturing R&D investment, past sales, ATO, investment in assets, and market concentration. The following variables are statistically significant: per_xrd, the company's R&D investment as a percentage of total industry R&D investment; incr_xrd, a count of year-over-year R&D increases over the past 5 years; avg_xrd, average investment in R&D; avg_at, average investment in total assets; per_at, the company's investment in capital as a percentage of industry investment in capital; and hhi, the Herfindahl index.

The results show a positive correlation between per_xrd and ATO in year t+1, a negative correlation between incr_xrd and ATO in year t+1, and a positive correlation between avg_xrd and ATO in year t+1. I interpret this as R&D, in general, having a strong influence on asset turnover, with high and consistent investment in R&D eventually enabling product differentiation. I interpret the negative correlation between incr_xrd and ATO in year t+1 as a sign that an investment in R&D may not immediately translate into higher sales or better use of existing assets.

The results also show that avg_at and ATO in year t+1 are positively correlated, and that per_at and ATO in year t+1 are strongly positively related. In my judgment, these results from the general data set carry over well to the sportswear industry, which is capital intensive. Consistent and significant investment in assets could result in higher sales and would provide a platform for the company to differentiate itself, leverage economies of scale, and establish itself as a brand. I interpret the negative correlation between hhi and ATO in year t+1 as follows: despite heavy investment in R&D and efforts toward product differentiation, if the market is extremely concentrated and competitive, it is difficult to increase sales, which produces an inverse relationship with future ATO.
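
To make the reading of signs and significance concrete, the sketch below fits a two-predictor linear model on simulated data and pulls the sign and 5% significance flag for each slope from `summary()`. The data are simulated, not the study's Compustat sample, and the coefficients 0.5 and -0.3 are arbitrary assumptions chosen only to produce one positive and one negative effect.

```r
set.seed(1)
n <- 500
sim <- data.frame(per_xrd = runif(n), hhi = runif(n))

# Simulated target with a positive per_xrd effect and a negative hhi effect.
sim$ato_ld1 <- 1 + 0.5 * sim$per_xrd - 0.3 * sim$hhi + rnorm(n, sd = 0.1)

fit   <- lm(ato_ld1 ~ per_xrd + hhi, data = sim)
coefs <- coef(summary(fit))  # columns: Estimate, Std. Error, t value, Pr(>|t|)

# Drop the intercept, then report each slope's sign and 5% significance.
slopes <- coefs[-1, , drop = FALSE]
data.frame(
  sign        = ifelse(slopes[, "Estimate"] > 0, "+", "-"),
  significant = slopes[, "Pr(>|t|)"] < 0.05
)
```

Each slope in a multiple regression is a partial association, holding the other predictors fixed, so its sign can differ from the simple pairwise correlation between that predictor and the outcome.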

Model 2: Results of prediction from past sales and ATO, and three variables focused on R&D

Table 4 presents the prediction results from past sales and ATO and three variables focused on R&D. The results reinforce the interpretation from model 1 of the influence of R&D, with incr_xrd, per_xrd, and avg_xrd statistically significant, and show an inverse relation between present ATO and ATO in year t+1. On initial analysis, this inverse relation runs counter to my expectations.

Model 3: Ensemble methods - Boosting

Table 5 presents the results from feeding the majority of the predictor variables into a boosted tree fit to project ATO for year t+1. The results show the relative influence of each predictor variable on the outcome variable. The percentage of the company's total assets relative to total investment in assets plays the most important role in driving asset turnover, with a relative influence of 50.26%. It is followed by average investment in R&D (25%), average investment in assets (10.27%), and industry concentration (9.71%).

On deeper analysis, I feel that because the entire data set is taken into consideration, the general picture may not reflect my specific industry. However, the insights provide a reference point and an overall big picture to start from as I narrow down toward industry-specific results.

Table 6 provides a graph to help visualize the relative influence of each variable.

Model 4

Table 7 presents results from the model with interactions between variables to evaluate ATO in year t+1. The model reinforces the influence of R&D and investment in assets on ATO in year t+1, and highlights the interactions per_xrd:avg_xrd and avg_at:per_at as statistically significant, with a direct impact on ATO in year t+1.
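
One detail worth keeping in mind when reading the interaction results: in an R formula, `a * b` is shorthand for `a + b + a:b`, so the Table 7 snippet estimates both main effects as well as each interaction. The sketch below shows the difference on simulated data; the data frame `sim` and its values are assumptions for illustration only.

```r
set.seed(2)
n <- 200
sim <- data.frame(per_xrd = runif(n), avg_xrd = runif(n))
sim$ato_ld1 <- 1 + sim$per_xrd * sim$avg_xrd + rnorm(n, sd = 0.1)

# `a * b` expands to both main effects plus the interaction term.
fit_star  <- lm(ato_ld1 ~ per_xrd * avg_xrd, data = sim)
# `a:b` on its own fits only the interaction term.
fit_colon <- lm(ato_ld1 ~ per_xrd:avg_xrd, data = sim)

names(coef(fit_star))   # intercept, both main effects, and the interaction
names(coef(fit_colon))  # intercept and the interaction only
```

The per_xrd:avg_xrd and avg_at:per_at labels in the regression output therefore refer to the product terms alone.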

Model 5

Table 8 presents the KNN regression, run first on the entire data set and then on the subset filtered by industry code. The results are shown in Panel A and Panel B respectively.

The results in Panel A reinforce per_xrd, incr_xrd, and ato_gr as statistically significant variables. The per_xrd results suggest that while initial investment in R&D leads to substantial growth in ATO, there is eventually a saturation point beyond which more investment in R&D does not significantly drive ATO. The results in Panel B were not meaningful initially because of an error in the calculation of the ato_ld1 variable. After fixing it and re-running the model with the corrected variable, the model produced meaningful results (Panel C).
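
The two-step procedure in the Table 8 snippet (fit on the full sample, then compare actual and predicted values within selected industry codes) can be sketched in base R on simulated data. Everything here is an assumption for illustration: the data frame `sim`, the placeholder industry code "000000", and the concave relationship used to mimic a saturation effect.

```r
set.seed(3)
n <- 400
sim <- data.frame(
  naics   = sample(c("315220", "315990", "000000"), n, replace = TRUE),
  per_xrd = runif(n, 0, 3)
)

# Simulated concave relationship: gains flatten as per_xrd grows.
sim$ato_ld1 <- 1 + 0.9 * sim$per_xrd - 0.15 * sim$per_xrd^2 + rnorm(n, sd = 0.1)

# Cubic polynomial fit on the full sample.
pnfit <- lm(ato_ld1 ~ poly(per_xrd, 3, raw = TRUE), data = sim)

# Restrict to the two industry codes and compare actual to predicted values.
chk <- sim[sim$naics %in% c("315220", "315990"), ]
chk$pred <- predict(pnfit, newdata = chk)
check <- lm(ato_ld1 ~ pred, data = chk)

coef(check)               # a slope near 1 suggests predictions track actuals
summary(check)$r.squared  # share of variance in actuals explained
```

Because the subset is drawn from the same sample used for fitting, this check is not a true out-of-sample test; holding out later years would be a stricter comparison.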

Conclusion

Several useful insights emerge from the models. Across multiple models, the significance and importance of R&D and total assets are emphasized over and over again.

Of the models studied, the basic linear model performs best and yields the most insight into the current data set. My interpretation is that R&D, in general, has a strong influence on asset turnover and would eventually enable product innovation and differentiation. Because the sportswear industry is capital intensive, consistent and significant investment in assets would provide a platform for the company to differentiate itself, leverage economies of scale, and establish itself as a brand. The negative correlation between hhi and ATO in year t+1 highlights the challenges of growing in an extremely concentrated and competitive market.

 

APPENDIX

Table 1

Variable Definitions

 

| Variable | Definition |
| --- | --- |
| per_xrd | Percentage of the company's investment in R&D over total investment in R&D within the industry |
| incr_xrd | Count to evaluate the increase in R&D YoY by evaluating data over the past 5 years |
| avg_xrd | Average investment in R&D over the years |
| salesgr | Percent change in total sales t-1 to t |
| salesgr_count | Count to evaluate an increase in sales YoY by evaluating data over the past 5 years to measure past expectations |
| ato | Asset turnover ratio of year t: sale/at |
| atopr | Asset turnover ratio of year t-1 |
| ato_gr | Change in ATO t-1 to t |
| atogr_count | Count variable to store YoY increase in ATO over the past 5 years |
| ato_ld1 | Variable to store next year's ATO |
| avg_at | Average investment in at = mean(at) |
| per_at | Percentage of investment in capital over industry investment in capital |
| mktsh | Market share of the company by revenue = sale/sum(sale) |
| hhi | Herfindahl index to evaluate market concentration to further determine market competitiveness |

 

Table 2

Summary Statistics

 

 

 

 

| Variable | Mean | Min | Median | Max |
| --- | --- | --- | --- | --- |
| per_xrd | 0.42405 | 0.0000 | 0.01579 | 8.00790 |
| incr_xrd | 1.78300 | 0.0000 | 2.00000 | 5.00000 |
| avg_xrd | 186.8413 | 0.0222 | 234.7899 | 377.6486 |
| salesgr | 5.317 | -33.900 | 2.829 | 103.914 |
| salesgr_count | 3.063 | 0.0000 | 3.0000 | 5.00000 |
| ato | 1.057 | 0.00051 | 0.87890 | 9.3644 |
| atopr | 1.08033 | 0.00511 | 0.9005 | 13.177 |
| ato_gr | -0.00667 | -0.42614 | 0.0000 | 0.63396 |
| atogr_count | 3.007 | 0.000 | 3.000 | 5.000 |
| ato_ld1 | 7430.22 | 36.69 | 2029.65 | 100184.0 |
| avg_at | 9657 | 1832 | 7286 | 86646 |
| per_at | 0.636704 | 0.001851 | 0.12615 | 7.94685 |
| mktsh | 0.008515 | 0.0000 | 0.00133 | 0.87614 |
| ato_ld1* | 1.0404 | 0.0511 | 0.8699 | 3.1670 |

 

Table 3

Results from linear model to predict ATO for year t+1

 

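Model:

$$ATO_{t+1} = \alpha + \beta_1\,\mathrm{per\_xrd} + \beta_2\,\mathrm{incr\_xrd} + \beta_3\,\mathrm{avg\_xrd} + \beta_4\,\mathrm{salesgr} + \beta_5\,\mathrm{salesgr\_count} + \beta_6\,\mathrm{ato\_gr} + \beta_7\,\mathrm{avg\_at} + \beta_8\,\mathrm{per\_at} + \beta_9\,\mathrm{hhi}$$

R code snippet:

olsfit <- lm(ato_ld1 ~ per_xrd + incr_xrd + avg_xrd + salesgr + salesgr_count + ato_gr + avg_at + per_at + hhi, data = temp)
summary(olsfit)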

Results

 

Table 4

Results of prediction from past sales and ATO, and three variables focused on R&D

 

Model:

$$ATO_{t+1} = \alpha + \beta_1\,\mathrm{incr\_xrd} + \beta_2\,\mathrm{salesgr} + \beta_3\,\mathrm{ato} + \beta_4\,\mathrm{per\_xrd} + \beta_5\,\mathrm{avg\_xrd}$$

R code snippet:

est <- lm(ato_ld1 ~ incr_xrd + salesgr + ato + per_xrd + avg_xrd, data = temp)
summary(est)

Results:

Table 5

Ensemble methods: Boosting

 

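Model:

$$ATO_{t+1} = \alpha + \beta_1\,\mathrm{per\_xrd} + \beta_2\,\mathrm{incr\_xrd} + \beta_3\,\mathrm{avg\_xrd} + \beta_4\,\mathrm{salesgr} + \beta_5\,\mathrm{salesgr\_count} + \beta_6\,\mathrm{ato\_gr} + \beta_7\,\mathrm{avg\_at} + \beta_8\,\mathrm{per\_at} + \beta_9\,\mathrm{hhi}$$

R code snippet:

library(gbm)  # provides gbm()

boostedtreefit <- gbm(ato_ld1 ~ per_xrd + incr_xrd + avg_xrd + salesgr + salesgr_count + ato_gr + avg_at + per_at + hhi, data = temp, distribution = "gaussian", n.trees = 10000, shrinkage = 0.001, interaction.depth = 4)
summary(boostedtreefit)  # relative influence of each predictor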

Results

 

 

Table 6

Graph showing the relative influence of each variable on the prediction

Note: The graph is for illustration purposes. Some variables are not displayed properly on the y-axis; please refer to Table 5 for complete information.

Table 7

Model leveraging interactions among the variables to evaluate the impact

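Model:

$$ATO_{t+1} = \alpha + \beta_1\,\mathrm{incr\_xrd} + \gamma\,(\mathrm{avg\_xrd} \times \mathrm{per\_xrd}) + \beta_2\,\mathrm{salesgr\_count} + \delta\,(\mathrm{avg\_at} \times \mathrm{per\_at}) + \beta_3\,\mathrm{ato} + \beta_4\,\mathrm{atopr}$$

R code snippet:

pnintfit <- lm(ato_ld1 ~ incr_xrd + per_xrd * avg_xrd + salesgr_count + avg_at * per_at + ato + atopr, data = temp)
summary(pnintfit)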

Results

Table 8: KNN regression and filtering by industry code

Model:

pnfit <- lm(ato_ld1 ~ poly(per_xrd, 3, raw = TRUE) + poly(incr_xrd, 3, raw = TRUE) + poly(avg_xrd, 3, raw = TRUE) + poly(salesgr, 3, raw = TRUE) + poly(ato_gr, 3, raw = TRUE), data = temp)

R code snippet:

library(dplyr)  # provides %>% and filter()

# Cubic polynomial fit on the full data set
pnfit <- lm(ato_ld1 ~ poly(per_xrd, 3, raw = TRUE) + poly(incr_xrd, 3, raw = TRUE) + poly(avg_xrd, 3, raw = TRUE) + poly(salesgr, 3, raw = TRUE) + poly(ato_gr, 3, raw = TRUE), data = temp)
summary(pnfit)

# Keep only the two industry codes of interest
chcktest <- temp %>%
  filter(naics == '315220' | naics == '315990')

# Regress actual next-year ATO on the model's predictions for the subset
summary(lm(ato_ld1 ~ predict(pnfit, chcktest), data = chcktest))

Results:

Panel A: KNN regression without filtering by industry code

Panel B: KNN regression on filtering by industry code

Panel C: KNN regression on filtering by industry code after correction to ato_ld1
