3 Exploring Data
In this chapter we show how to explore and analyze data using the dataset created in Chapter 2:
discuss frequency and properties of the data (log-returns vs discrete returns)
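To make the distinction between discrete returns and log-returns concrete, the following minimal sketch uses a hypothetical price vector `prices` (made up for illustration, not taken from the dataset). It shows that discrete returns compound multiplicatively, whereas log-returns simply add up over time.

```r
# Hypothetical month-end prices, for illustration only (not from the dataset)
prices <- c(100, 105, 99.75, 109.725)

# Discrete (arithmetic) returns: R_t = P_t / P_{t-1} - 1
discrete <- prices[-1] / prices[-length(prices)] - 1

# Log-returns: r_t = log(P_t / P_{t-1}) = log(1 + R_t)
logret <- diff(log(prices))

# Discrete returns compound multiplicatively, log-returns add up over time
total_discrete <- prod(1 + discrete) - 1
total_log <- sum(logret)

print(round(discrete, 4))
print(all.equal(exp(total_log) - 1, total_discrete))
```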
We load the data and check on it:
#> # A tibble: 2 x 11
#> symbol company identifier sedol weight sector shares_held local_currency
#> <chr> <chr> <chr> <chr> <dbl> <chr> <dbl> <chr>
#> 1 AAPL Apple Inc. 03783310 2046~ 0.0616 Informat~ 168974140 USD
#> 2 MSFT Microsoft~ 59491810 2588~ 0.0607 Informat~ 81130190 USD
#> # ... with 3 more variables: last.sale.price <dbl>, market.cap <dbl>,
#> # ipo.year <int>
3.1 Plotting and Charting Data
In this section we show how to create various graphs of financial time series, which should help us to get a better understanding of their properties before we go on to calculate and test their statistics.
3.1.1 Time-series plots
We can now directly plot the return series using a bar chart, with one panel per stock:
stocks.returns.monthly %>% select(symbol, date, return) %>%
ggplot(aes(x=date,y=return,col=symbol)) + geom_bar(stat = "identity") + facet_wrap(~symbol)
Often we want to compare the performance of different investments graphically. This can be done by assuming an investment of one dollar at a particular point in time and plotting the resulting cumulative time series. We aggregate arithmetic returns as \(1+R_{1:t}=\prod_{s=1}^{t}(1+R_{s})\). The resulting series is depicted below, with the y-scale log-transformed due to the extraordinary performance of some of the companies relative to others.
stocks.returns.monthly %>% select(symbol, date, return) %>%
group_by(symbol) %>% mutate(wealth=cumprod(1+return)) %>% # cumulate returns separately for each stock
ggplot(aes(x=date,y=wealth,col=symbol)) + geom_line() + scale_y_log10()
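Because the data are stored in long format, the cumulative product has to be taken separately for each symbol. The following minimal sketch illustrates this with a hypothetical data frame `toy` (two made-up symbols, not part of the dataset) and base R only. It also shows what would happen if the returns of all symbols were cumulated in one go.

```r
# Hypothetical long-format returns for two symbols (illustration only)
toy <- data.frame(
  symbol = rep(c("AAA", "BBB"), each = 3),
  date   = rep(as.Date(c("2020-01-31", "2020-02-29", "2020-03-31")), times = 2),
  return = c(0.10, -0.50, 0.20, 0.00, 0.10, 0.10)
)

# Cumulate within each symbol; ave() applies cumprod() group by group
toy$wealth <- ave(1 + toy$return, toy$symbol, FUN = cumprod)

# Without grouping, the second symbol would continue the first symbol's path
toy$wealth_ungrouped <- cumprod(1 + toy$return)

print(toy)
```

In the ungrouped column the second symbol starts from the final wealth of the first one, which is why the grouped calculation is the one we want for a comparison across stocks.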
Another series that is rather important when talking about performance is the drawdown, which describes the decline from a historical peak in a cumulative return series. Unfortunately, the Drawdowns() function is per se only available for xts input, so we first convert the returns to an xts object:
stocks.returns.monthly %>% select(symbol, date, return) %>% group_by(symbol) %>%
mutate(dd = as.numeric(Drawdowns(xts::xts(return, order.by = date)))) # one xts series per stock, built with xts::xts()
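The drawdown can also be computed directly from its definition, \(D_{t}=W_{t}/\max_{s\le t}W_{s}-1\), where \(W_{t}\) denotes the value of one dollar invested at the start. The following minimal sketch does this in base R for a hypothetical return vector `returns` (made up for illustration). It is an alternative to the PerformanceAnalytics function, not a description of how that function is implemented.

```r
# Hypothetical monthly returns of a single asset (illustration only)
returns <- c(0.10, -0.20, 0.05, 0.30, -0.10)

wealth   <- cumprod(1 + returns)       # growth of one dollar
peak     <- cummax(c(1, wealth))[-1]   # running maximum, starting from the initial dollar
drawdown <- wealth / peak - 1          # relative decline from the peak (never positive)

print(round(data.frame(returns, wealth, peak, drawdown), 4))
```

For data in long format, the same three steps would have to be applied separately for each symbol.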
A similar chart can be produced by means of the PerformanceAnalytics package:
stocks.returns.monthly %>% select(symbol, date, return) %>%
group_by(symbol) %>% mutate(wealth=cumprod(1+return)) %>% # cumulate returns separately for each stock
ggplot(aes(x=date,y=wealth,col=symbol)) + geom_line() + scale_y_log10()
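Since the code above still relies on ggplot2, the following minimal sketch shows what a chart drawn by PerformanceAnalytics itself might look like. It assumes the packages xts and PerformanceAnalytics are installed and uses a simulated return matrix `R` for two hypothetical assets `AAA` and `BBB` (not the stocks of our dataset).

```r
library(xts)
library(PerformanceAnalytics)

# Simulated monthly returns for two hypothetical assets (illustration only)
set.seed(42)
dates <- seq(as.Date("2015-01-01"), by = "month", length.out = 36)
R <- xts(
  cbind(AAA = rnorm(36, mean = 0.010, sd = 0.05),
        BBB = rnorm(36, mean = 0.005, sd = 0.03)),
  order.by = dates
)

# Growth of one dollar (wealth.index = TRUE starts the series at 1)
chart.CumReturns(R, wealth.index = TRUE, legend.loc = "topleft",
                 main = "Growth of one dollar")

# Drawdowns of the same series
chart.Drawdown(R, legend.loc = "bottomleft", main = "Drawdowns")
```

To use these functions with the tidy data of this chapter, the returns would first have to be spread into one column per symbol and converted to an xts object, for example with timetk::tk_xts().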
3.1.4 Quantile Plots
Putting it all together:
pm <- GGally::ggpairs(iris)
#> Registered S3 method overwritten by 'GGally':
#> method from
#> +.gg ggplot2
if (output %in% c("latex", "docx")) {
  pm
} else if (output == "html") {
  plotly::ggplotly(pm)
} else {
  print("No format defined for this output filetype")
}
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> Warning: Can only have one: highlight
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> Warning: Can only have one: highlight
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#> Warning: Can only have one: highlight
3.2 Analyzing Data
3.2.1 Calculating Statistics and testing and factor exposure
simple and by using PerformanceAnalytics through tidyquant
summary statistics, sample mean and covariance estimation, higher moments, tests for (multivariate) normality, quantiles and other risk measures per asset/time period, auto-correlation and predictability?
factor analysis, betas, alphas
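As a first impression of the "simple" route, the following minimal sketch computes some of the statistics listed above with base R only. The return series `r` is simulated and purely hypothetical, and the choice of the Shapiro-Wilk and Ljung-Box tests is an assumption made for illustration, not necessarily the tests used later in this chapter.

```r
# Simulated monthly returns of one hypothetical asset (illustration only)
set.seed(123)
r <- rnorm(120, mean = 0.01, sd = 0.05)

# Sample moments
m    <- mean(r)
s    <- sd(r)
skew <- mean((r - m)^3) / mean((r - m)^2)^1.5   # sample skewness
kurt <- mean((r - m)^4) / mean((r - m)^2)^2     # sample kurtosis (3 under normality)

# 5% quantile of the returns, the basis of a historical value-at-risk
q_5 <- quantile(r, probs = 0.05)

# Test for normality and for autocorrelation up to lag 12
normality <- shapiro.test(r)
autocorr  <- Box.test(r, lag = 12, type = "Ljung-Box")

print(c(mean = m, sd = s, skewness = skew, kurtosis = kurt))
print(q_5)
print(normality$p.value)
print(autocorr$p.value)
```

For several assets in long format, the same calculations would be carried out separately for each symbol.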
3.2.6 Exposure to Factors
The stocks in our example all have a certain exposure to risk factors (e.g. the Fama-French factors we have added to our dataset). Let us estimate these exposures by regressing each stock's return on the factors Mkt.RF, SMB and HML using the methods from section 1.2.3.
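The regression itself can be sketched with lm() from base R. The example below is a minimal illustration on simulated data. The factor columns carry the names used in the text, but all values, the data frame `factors` and the "true" exposures (1.2, 0.4 and -0.3) are assumptions made up for this sketch.

```r
# Simulated factor returns and one hypothetical stock (illustration only);
# the column names mirror the Fama-French factors named in the text
set.seed(1)
n <- 240
factors <- data.frame(
  Mkt.RF = rnorm(n, 0.005, 0.04),
  SMB    = rnorm(n, 0.002, 0.03),
  HML    = rnorm(n, 0.003, 0.03)
)

# Assumed "true" exposures used to generate the stock return
factors$return <- 0.001 + 1.2 * factors$Mkt.RF + 0.4 * factors$SMB -
  0.3 * factors$HML + rnorm(n, 0, 0.005)

# The intercept estimates alpha, the slopes estimate the factor betas
fit <- lm(return ~ Mkt.RF + SMB + HML, data = factors)
print(round(coef(fit), 3))
```

The estimated slopes should lie close to the exposures used in the simulation. With real data, one such regression would be run for each stock.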