---
title: "Importing and Managing Financial Data in R"
output:
  html_notebook:
    toc: true
    toc_float: true
    toc_collapsed: false
    number_sections: true
    
    toc_depth: 3
---

# Introduction and downloading data

## Introduction and downloading data

### Introducing getSymbols()

The getSymbols() function from the quantmod package provides a consistent interface to import data from various sources into your workspace. By default, getSymbols() imports the data as an xts object.

This exercise will introduce you to getSymbols(). You will use it to import QQQ data from Yahoo! Finance. QQQ is an exchange-traded fund that tracks the Nasdaq 100 index, and Yahoo! Finance is the default data source for getSymbols().

You use the Symbols argument to specify the instrument (i.e. the ticker symbol) you want to import. Since Symbols is the first argument to getSymbols(), you usually just type the instrument name and omit Symbols =.
```{r}
# Load the quantmod package
library(quantmod)

# Import QQQ data from Yahoo! Finance
getSymbols(Symbols = "QQQ", auto.assign = TRUE)

# Look at the structure of the object getSymbols created
str(QQQ)

# Look at the first few rows of QQQ
head(QQQ)
```
### Data sources

In the last exercise, you imported data from Yahoo! Finance. The src argument allows you to tell getSymbols() to import data from a different data source.

In this exercise, you will import data from Alpha Vantage and FRED. Alpha Vantage is a source similar to Yahoo! Finance. FRED is an online database of economic time series data created and maintained by the Federal Reserve Bank of St. Louis.

getSymbols() imports data from Yahoo! Finance by default because src = "yahoo" by default. The src values for Alpha Vantage and FRED are "av" and "FRED", respectively.
```{r}
# Import QQQ data from Alpha Vantage
# getSymbols(Symbols = "QQQ", src = "av")
# https://www.alphavantage.co/ to get API key
# Look at the structure of QQQ
# str(QQQ)

# Import GDP data from FRED
getSymbols(Symbols = "GDP", src = "FRED")

# Look at the structure of GDP
str(GDP)
```
### Make getSymbols() return the data it retrieves
In the last exercise, getSymbols() automatically created an object named like the symbol you provided. This exercise will teach you to make getSymbols() return the data, so you can assign the output yourself.

There are two arguments that will make getSymbols() return the data:

1. Set auto.assign = FALSE.
2. Set env = NULL.

The two methods are functionally equivalent, but auto.assign = FALSE describes the behavior better. Use it because you will be more likely to remember what auto.assign = FALSE means in the future.
```{r}
# Assign SPY data to 'spy' using auto.assign argument
spy <- getSymbols(Symbols = "SPY", auto.assign = FALSE)

# Look at the structure of the 'spy' object
str(spy)

# Assign JNJ data to 'jnj' using env argument
jnj <- getSymbols(Symbols = "JNJ", env = NULL)

# Look at the structure of the 'jnj' object
str(jnj)
```
Turning off auto.assign is useful if you want to assign the data to an object yourself.

## Introduction to Quandl

Similar to how the quantmod package provides getSymbols() to import data from various sources, the Quandl package provides access to the Quandl databases via one simple function: Quandl().

Recall that getSymbols() uses the Symbols and src arguments to specify the instrument and data source, respectively. The Quandl() function specifies both the data source and the instrument via its code argument, in the form "DATABASE/DATASET".

Two other ways Quandl() differs from getSymbols() are:

1. Quandl() returns a data.frame by default.
2. Quandl() will not automatically assign the data to an object.

If you plan on importing a lot of data using Quandl(), you might consider opening a free account with them in order to get an API key.

```{r}
# Load the Quandl package
library(Quandl)

# Import GDP data from FRED
# gdp <- Quandl(code = "FRED/GDP")

# Look at the structure of the object returned by Quandl
# str(gdp)
```
Quandl provides access to a large number of data series. Their website has documentation for it all!

### Return data type

The Quandl() function returns a data.frame by default. It can return other classes via the type argument.

The possible values for type are:

1. "raw" (a data.frame),
2. "ts" (time-series objects from the stats package),
3. "zoo",
4. "xts", and
5. "timeSeries" (from the timeSeries package in the RMetrics suite).

In this exercise, you will learn how to use the type argument to make Quandl() return an xts and a zoo object.

```{r}
# Import GDP data from FRED as xts
# gdp_xts <- Quandl(code = "FRED/GDP", type = "xts")

# Look at the structure of gdp_xts
# str(gdp_xts)
 
# Import GDP data from FRED as zoo
# gdp_zoo <- Quandl(code = "FRED/GDP", type = "zoo")

# Look at the structure of gdp_zoo
# str(gdp_zoo)
```
Having the flexibility to return different data types is a great bonus and less work for you!
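
To make the difference between the return types concrete, here is a minimal offline sketch. It does not call Quandl(); the `gdp_raw` data frame and its values are made up to resemble a "raw" result with a date column, and the conversion shown is only one plausible way to get from a data.frame to an xts object by hand.

```r
library(xts)

# Hypothetical data shaped like a "raw" result: a Date column plus a value column
gdp_raw <- data.frame(
  Date  = as.Date(c("2020-01-01", "2020-04-01", "2020-07-01")),
  Value = c(100, 90, 97)
)

# Build an xts object by hand, using the Date column as the time index
gdp_manual_xts <- xts(gdp_raw["Value"], order.by = gdp_raw$Date)

class(gdp_raw)
class(gdp_manual_xts)
```

Asking for type = "xts" up front saves you this manual step.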

## Finding data from internet sources

### Find stock ticker from Yahoo Finance

You need the instrument identifier to import data from an internet data source. Identifiers can often be found on the data source's website. In this exercise, you will search Yahoo Finance for the ticker symbol for Pfizer stock.

Note that some sources may not provide data for certain symbols, even if you can see the data displayed on their website in tables and/or charts. getSymbols() will error if the data is not available for download.
```{r}
# Create an object containing the Pfizer ticker symbol
symbol <- "PFE"

# Use getSymbols to import the data for the symbol stored above
getSymbols(symbol)

# Look at the first few rows of data
head(PFE)
```
Looking up identifiers online is common when seeking data about a new instrument, so it's good to get comfortable with the process!

### Download exchange rate data from Oanda

Oanda.com provides historical foreign exchange data for many currency pairs. Currency pairs are expressed as two currencies, the "base" and the "quote", separated by a "/". For example, the U.S. Dollar to Euro exchange rate would be "USD/EUR".

Note that getSymbols() will automatically convert "USD/EUR" to a valid name by removing the "/". For example, getSymbols("USD/EUR") would create an object named USDEUR.

Also, Oanda.com only provides 180 days of historical data. getSymbols() will warn and return as much data as possible if you request data from more than 180 days ago. You can use the from and to arguments to set a date range; both should be strings in "%Y-%m-%d" format (e.g. "2016-02-06").

quantmod::oanda.currencies contains a list of currencies provided by Oanda.com.
```{r}
# Create a currency_pair object
currency_pair <- "GBP/CAD"

# Load British Pound to Canadian Dollar exchange rate data
getSymbols(currency_pair, src = "oanda")

# Examine object using str()
str(GBPCAD)

# Try to load data from 190 days ago
getSymbols(currency_pair, from = Sys.Date() - 190, to = Sys.Date(), src = "oanda")
```
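
The naming rule and the date format described above can be checked without any download. The sketch below only mimics the "/" removal with base R (it is not quantmod's internal code) and builds a from/to pair that stays inside the 180-day window; the variable names are illustrative.

```r
# Mimic the name conversion described above: drop the "/" from the pair
pair <- "GBP/CAD"
object_name <- gsub("/", "", pair, fixed = TRUE)
object_name

# Build a date range that stays inside the 180-day window,
# formatted as "%Y-%m-%d" strings for the from and to arguments
to_date    <- Sys.Date()
from_date  <- to_date - 179
date_range <- format(c(from_date, to_date), "%Y-%m-%d")
date_range
```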
### Find and import Unemployment Rate data from FRED

Both getSymbols() and Quandl() provide access to the FRED database. In this exercise, you will find the [FRED](https://fred.stlouisfed.org/) symbol for the United States civilian unemployment rate. Then you will use the series name to download the data directly from FRED using getSymbols(), and from the Quandl database using Quandl().

Remember that getSymbols() specifies the data source using the src argument and that Quandl() specifies it as part of the Quandl code (i.e. database/series).
```{r}
# Create a series_name object
series_name <- "UNRATE"

# Load the data using getSymbols
getSymbols(series_name, src = "FRED")

# Create a quandl_code object
# quandl_code <- "FRED/UNRATE"

# Load the data using Quandl
# unemploy_rate <- Quandl(quandl_code)
```
# Extracting and transforming data

## Extracting columns from financial time series

### Extract one column from one instrument

The quantmod package provides several helper functions to extract specific columns from an object, based on the column name. The Op(), Hi(), Lo(), Cl(), Vo(), and Ad() functions can be used to extract the open, high, low, close, volume, and adjusted close column, respectively.

In this exercise, you will use two of these functions on an xts object named DC. The DC object contains fictitious DataCamp OHLC (open, high, low, close) stock prices created by randomizing some real financial market data. DC is similar to the xts objects created by getSymbols().

While it's not necessary to complete the exercise, you can learn more about all the extractor functions from help("OHLC.Transformations").

```{r}
load(file = "DC.RData")
```
```{r}
DC <- DC[,c(1,2)]
DC <- to.hourly(DC, indexAt = "startof")
```

```{r}
library(quantmod)
# Extract the close column
dc_close <- Cl(DC)

# Look at the head of dc_close
head(dc_close)

# Extract the volume column
dc_volume <- Vo(DC)

# Look at the head of dc_volume
head(dc_volume)
```
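
If you do not have DC.RData at hand, the extractor functions can be tried on a small synthetic object. The `TOY` object below is made up for illustration; its column names follow the "SYMBOL.Column" pattern that getSymbols() output typically uses, which is what the extractors match on.

```r
library(quantmod)

# Hypothetical OHLCV data; the "TOY." prefix mimics getSymbols() column names
toy_dates <- as.Date("2024-01-01") + 0:2
TOY <- xts(
  cbind(
    TOY.Open   = c(10, 11, 12),
    TOY.High   = c(12, 13, 14),
    TOY.Low    = c(9, 10, 11),
    TOY.Close  = c(11, 12, 13),
    TOY.Volume = c(1000, 1500, 1200)
  ),
  order.by = toy_dates
)

Op(TOY)  # open column
Cl(TOY)  # close column
Vo(TOY)  # volume column
```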
### Extract multiple columns from one instrument

The quantmod package provides functions to extract a single column, and also has functions to extract specific sets of columns.

Recall OHLC stands for open, high, low, close. Now you can guess which columns the OHLC() and HLC() functions extract. There's also an OHLCV() function, which adds the volume column.

These functions are helpful when you need to pass a set of columns to another function. For example, you might need to pass the high, low, and close columns (in that order) to a technical indicator function.
```{r}
# Extract the high, low, and close columns
dc_hlc = HLC(DC)

# Look at the head of dc_hlc
head(dc_hlc)

# Extract the open, high, low, close, and volume columns
dc_ohlcv = OHLCV(DC)

# Look at the head of dc_ohlcv
head(dc_ohlcv)
```
### Use getPrice to extract other columns

The extractor functions you learned in the previous two exercises do not cover all use cases. Sometimes you might have one object that contains the same price column for multiple instruments. Other times, you might have an object with price data (e.g. bid, ask, trade) that does not have an explicit extractor function.

The getPrice() function in the quantmod package can extract any column by name by using the prefer argument. It can also extract columns for a specific instrument by using the symbol argument, which is useful when an object contains several instruments with the same price type.

You can use regular expressions for both the prefer and symbol arguments, because they are passed to the base::grep() function internally.

```{r include=FALSE}
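# Read the Quandl API key from an environment variable instead of hard-coding it
# (QUANDL_API_KEY is an assumed variable name; set it in your .Renviron)
Quandl.api_key(Sys.getenv("QUANDL_API_KEY"))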
```
```{r}
# Download two Quandl datasets as a single xts object
oil_data <- Quandl(code = c("CHRIS/CME_QX7", "CFTC/067653_FO_L_ALL_CR"), type = "xts")

# Look at the column names of the oil_data object
colnames(oil_data)

# Extract the Open price column of the CME_QX7 dataset
cl_open <- getPrice(oil_data, symbol = "CME_QX7", prefer = "Open$")

# Look at January, 2016 using xts' ISO-8601 subsetting
cl_open["2016-01"]
```
getPrice() is a very flexible way to retrieve the columns you need.
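
The interplay of symbol and prefer is easier to see on a tiny object whose column names you control. The sketch below uses made-up instruments "AAA" and "BBB" and made-up prices; it assumes only that both arguments are matched against the column names, as described above.

```r
library(quantmod)

# Hypothetical object holding the same price types for two instruments
toy_dates <- as.Date("2024-01-01") + 0:2
toy_prices <- xts(
  cbind(
    AAA.Open   = c(50, 51, 52),
    AAA.Settle = c(50.5, 51.5, 52.5),
    BBB.Open   = c(55, 56, 57),
    BBB.Settle = c(55.5, 56.5, 57.5)
  ),
  order.by = toy_dates
)

# symbol narrows to columns containing "AAA";
# prefer = "Open$" then keeps column names that end in "Open"
aaa_open <- getPrice(toy_prices, symbol = "AAA", prefer = "Open$")
aaa_open

# Without symbol, prefer alone returns the matching column for every instrument
all_settle <- getPrice(toy_prices, prefer = "Settle")
all_settle
```

The "$" anchor matters when one column name is a substring of another, for example "Open" and "Open Interest".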

## Importing and transforming multiple instruments


### Use Quandl to download quarterly returns data

Sometimes you need to aggregate and/or transform raw data before you can continue your analysis. The Quandl() function allows you to specify common aggregations and transformations via the collapse and/or transform arguments. The Quandl API takes care of the details for you.

```{r}
# Download quarterly prices for the two CME datasets
qtr_price <- Quandl(code = c("CHRIS/CME_QM1", "CHRIS/CME_QG1"), collapse = "quarterly", type = "xts")

# View the high prices for both series
Hi(qtr_price)

# Download quarterly returns for the same two datasets
qtr_return <- Quandl(code = c("CHRIS/CME_QM1", "CHRIS/CME_QG1"), collapse = "quarterly", transform = "rdiff", type = "xts")

# View the settle price returns for both series
getPrice(qtr_return, prefer = "Settle")
```
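
As a rough guide to what transform = "rdiff" produces, the arithmetic can be reproduced by hand. The sketch assumes "rdiff" means the row-on-row percentage change, and the `settle` prices are made up.

```r
# Hypothetical quarter-end settle prices
settle <- c(Q1 = 100, Q2 = 110, Q3 = 99)

# Row-on-row percentage change: (current - previous) / previous
pct_change <- diff(settle) / head(settle, -1)
pct_change
```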
### Combine many instruments into one object

What if you need to aggregate or transform your data in ways Quandl() does not support? In those cases, you can use the flexibility of R.

One paradigm involves importing data into a new environment. Then you can use eapply() to call a function on each object in the environment, much like what lapply() does for each element of a list. Also like lapply(), eapply() returns a list.

Then you can merge all the elements of the list into one object by using do.call(), which is like having R programmatically type and run a command for you. Instead of typing merge(my_list[[1]], my_list[[2]], ...), you can type do.call(merge, my_list).
```{r}
# Create new environment
data_env <- new.env()
# Use getSymbols to load data into the environment
getSymbols(c("SPY", "QQQ"), env = data_env, auto.assign = TRUE)
```
```{r}
# Look at a few rows of the SPY data
head(data_env$SPY, 3)
```
```{r}
# Call head on each object in data_env using eapply
eapply(data_env, head)
```
```{r}
# Call head on each object in data_env using eapply
data_list <- eapply(data_env, head)

# Merge all the list elements into one xts object
data_merged <- do.call(merge, data_list)

# Ensure the columns are ordered: open, high, low, close
data_ohlc <- OHLC(data_merged)
data_ohlc
```
```{r}
# Extract the adjusted close column from each object
adjusted_list <- eapply(data_env, Ad)
# Merge each list element into one object
adjusted <- do.call(merge, adjusted_list)
head(adjusted)
```
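
The same environment, eapply(), do.call() pattern also works offline. The sketch below fills a hypothetical environment with two made-up single-column xts objects, so you can see each step without downloading anything.

```r
library(xts)

toy_dates <- as.Date("2024-01-01") + 0:2

# Hypothetical environment with one single-column xts object per instrument
toy_env <- new.env()
assign("AAA", xts(cbind(AAA.Close = c(1, 2, 3)), order.by = toy_dates), envir = toy_env)
assign("BBB", xts(cbind(BBB.Close = c(4, 5, 6)), order.by = toy_dates), envir = toy_env)

# eapply() returns a named list with one element per object in the environment;
# extra arguments (here n = 1) are passed on to the function
last_rows <- eapply(toy_env, tail, n = 1)

# Equivalent to merge(last_rows[[1]], last_rows[[2]])
toy_merged <- do.call(merge, last_rows)
toy_merged
```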
### Extract the Close column from many instruments

The previous exercise taught you how to use do.call(merge, eapply(env, fun)) to apply a function to each object in an environment and then combine all the results into one object.

Let's use what you learned to solve a very common problem. Often you will need to load similar data for many instruments, extract a column, and create one object that contains that specific column for every instrument.
```{r}
# Symbols
symbols <- c("AAPL", "MSFT", "IBM")

# Create new environment
data_env <- new.env()

# Load symbols into data_env
getSymbols(symbols, env = data_env)

# Extract the close column from each object and combine into one xts object
close_data <- do.call(merge, eapply(data_env, Cl))

# View the head of close_data
head(close_data)
```

# Managing data from multiple sources

## Setting default arguments for getSymbols()

### Set a default data source

Recall that getSymbols() imports from Yahoo Finance by default. This exercise will teach you how to change the default data source with the setDefaults() function.

The first argument to setDefaults() is the function you want to update, and the remaining arguments are name = value pairs of the arguments you want to update and the new default value.

Note that this only works with getSymbols() because getSymbols() actively checks to see if you want to use a different default value.
```{r}
# Set the default to pull data from Alpha Vantage
setDefaults(getSymbols, src = "av")

# Get GOOG data
# getSymbols("GOOG")

# Verify the data was actually pulled from Alpha Vantage
# str(GOOG)
```
Setting a default source can be useful if you use that source often.

### Set default arguments for a getSymbols source

You can also use setDefaults() on individual getSymbols() source methods. This exercise will teach you how to change the default value for the from argument to getSymbols.yahoo().

You can find the arguments for a specific method by using help() (e.g. help("getSymbols.yahoo")) or by calling args() to print them to the console (e.g. args(getSymbols.yahoo)). Calling getDefaults() will show you the current default values (if there are any).

Remember, you are not supposed to call getSymbols.yahoo() directly!
```{r}
# Look at getSymbols.yahoo arguments
args(getSymbols.yahoo)

# Set default 'from' value for getSymbols.yahoo
setDefaults(getSymbols.yahoo, from = "2000-01-01")

# Confirm defaults were set correctly
getDefaults("getSymbols.yahoo")
```
## Setting per-instrument default arguments

### Set default data source for one symbol

Changing the default source for one instrument is useful if multiple sources use the same symbol for different instruments. For example, getSymbols("CP", src = "yahoo") would load Canadian Pacific Railway data from the New York Stock Exchange. But getSymbols("CP", src = "FRED") would load Corporate Profits After Tax from the U.S. Bureau of Economic Analysis.

You can use setSymbolLookup() to specify the default data source for an instrument. In this exercise, you will learn how to make getSymbols("CP") load the corporate profit data from FRED instead of the railway stock data from Yahoo Finance.

setSymbolLookup() can take any number of name = value pairs, where name is the symbol and value is a named list of getSymbols() arguments for that one symbol.
```{r}
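# Reset the getSymbols default source to Yahoo Finance
setDefaults(getSymbols, src = "yahoo")

# Set the source for CP to Yahoo Finance
setSymbolLookup("CP" = "yahoo")

# Load CP data
getSymbols("CP")

# Look at the first few rows of CP
head(CP)

# Reset the per-symbol source for CP
setSymbolLookup("CP" = NULL)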

# Set the source for CP to FRED
setSymbolLookup("CP" = "FRED")

# Load CP data again
getSymbols("CP")

# Look at the first few rows of CP
head(CP)
```
Symbol clashes like this happen occasionally, so it is useful to be able to pull a single symbol from a specific source.
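
The named-list form of the value can be inspected without loading any data. In the sketch below, "TOYSYM" is a made-up symbol used only for illustration; the call stores the setting but downloads nothing.

```r
library(quantmod)

# "TOYSYM" is a hypothetical symbol; the value is a named list of
# getSymbols() arguments that apply to this one symbol
setSymbolLookup(TOYSYM = list(src = "FRED"))

# Inspect the stored per-symbol defaults
toy_lookup <- getSymbolLookup("TOYSYM")
toy_lookup$TOYSYM$src
```

Checking the lookup table this way before calling getSymbols() is a cheap way to confirm which source a symbol will use.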

### Save and load symbol lookup table

The previous exercise taught you how to set default arguments on a per-symbol basis, but those settings only last for the current session.

This exercise will teach you how to save and load symbol-based defaults by using saveSymbolLookup() and loadSymbolLookup(), respectively. You can use the file argument to specify a file to store your defaults.

You can also use the getSymbolLookup() function to check per-symbol defaults before you try to load data using getSymbols().

```{r}
# Save symbol lookup table
saveSymbolLookup("my_symbol_lookup.rda")

# Set default source for CP to "yahoo"
setSymbolLookup("CP" = "yahoo")

# Verify the default source is "yahoo"
getSymbolLookup("CP")

# Load symbol lookup table
loadSymbolLookup("my_symbol_lookup.rda")
getSymbolLookup("CP")
```
This will let you load the same lookup table even if you close out of R.

## Handling instrument symbols that clash or are not valid R names

### Access the object using get() or backticks

At some point, you might download data for an instrument that does not have a syntactically valid name. Syntactically valid names contain letters, numbers, ".", and "_", and must start with a letter or a "." followed by a non-number.

For example, the symbol for Berkshire Hathaway class A shares is "BRK-A", which is not a syntactically valid name because it contains a "-" character. Another example is Chinese stocks, which have symbols composed of numbers. The Yahoo Finance symbol for the SSE Composite Index is "000001.SS".

You can use the get() function or backticks (`) to access objects that do not have syntactically valid names.

```{r}
# Load BRK-A data
getSymbols("BRK-A")

# Use backticks and head() to look at the loaded data
head(`BRK-A`)

# Use get() to assign the BRK-A data to an object named BRK.A
BRK.A <- get("BRK-A")
```
Just remember to use backticks or get() if you ever run into invalid characters.
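
As a quick offline check of the naming rule, the sketch below tests the symbols mentioned above with make.names(), which leaves a name unchanged only when it is already syntactically valid. The numeric vector assigned to "BRK-A" is a made-up stand-in for downloaded data, so no internet access is needed.

```r
# Symbols mentioned in the text above
symbols <- c("BRK-A", "000001.SS", "CP", "T")

# A name is syntactically valid if make.names() returns it unchanged
is_valid <- make.names(symbols) == symbols
names(is_valid) <- symbols
is_valid

# Stand-in for downloaded data (hypothetical values, not real prices)
assign("BRK-A", c(100, 101, 102))

# Backticks and get() both return the same object
via_backticks <- `BRK-A`
via_get <- get("BRK-A")
identical(via_backticks, via_get)
```

Note that "T" passes the check: it is a valid name, which is exactly why it can silently clash with the shorthand for TRUE.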

### Create valid name for one instrument

If you are only downloading data for a single symbol and that symbol is not a syntactically valid name, you can set auto.assign = FALSE in your call to getSymbols(). That will allow you to directly assign the output to a syntactically valid name.

You may also want to convert the column names to syntactically valid names. That is a good idea if you plan to use the data in functions that expect column names to be syntactically valid names (e.g. lm()).
```{r}
# Create BRK.A object
BRK.A <- getSymbols("BRK-A", auto.assign = FALSE)

# Create col_names object with the column names of BRK.A
col_names <- colnames(BRK.A)

# Set BRK.A column names to syntactically valid names
colnames(BRK.A) <- make.names(col_names)
```
Now you can fix tricky ticker symbols in the column names of your data.

### Create valid names for multiple instruments

An earlier exercise taught you how to use setSymbolLookup() to set a default data source for getSymbols(). You can also use setSymbolLookup() to create a mapping between the instrument symbol and the name of the R object.

This is useful if you want to download data for a lot of symbols that are not syntactically valid names, or symbols that have names that conflict with other R variable names.

An example of a name that conflicts is the symbol for AT&T's stock, T, which is often used as a short form for the logical value TRUE.

To change the name of a given symbol, arguments must be passed to setSymbolLookup() as a list, like so: setSymbolLookup(NEW_NAME = list(name = "OLD_NAME")).
```{r}
# Set name for BRK-A to BRK.A
setSymbolLookup(BRK.A = list(name = "BRK-A"))

# Set name for T (AT&T) to ATT
setSymbolLookup(ATT = list(name = "T"))

# Load BRK.A and ATT data
getSymbols(c("BRK.A", "ATT"))
```
Now you can map troublesome tickers to new names with setSymbolLookup().

# Aligning data with different periodicities

## Making irregular data regular

### Create a zero-width and regular xts object

In order to create regular data from an irregular data set, the first thing you need is a regular sequence of date-times that span the dates of your irregular data set. A "regular" sequence of date-times has equally-spaced time points.

In this exercise, you will use the irregular_xts object to create a zero-width xts object that has a regular daily index. A zero-width xts object has an index of date-times, but no data columns.

```{r include=FALSE}
irregular_xts <- rbind(structure(c(4L, 21L, 1L, 34L), .Dim = c(4L, 1L), index = structure(c(1451692800, 
1451952000, 1452124800, 1452470400), tzone = "UTC", tclass = "Date"), class = c("xts", 
"zoo"), .Dimnames = list(NULL, "data")))
```

```{r}
# Extract the start date of the series
start_date <- start(irregular_xts)

# Extract the end date of the series
end_date <- end(irregular_xts)

# Create a regular date sequence
regular_index <- seq(from = start_date, to = end_date, by = "day")

# Create a zero-width xts object (an index only, with no data columns)
regular_xts <- xts(, order.by = regular_index)
```
```{r}
regular_xts
```
Making regular date-time sequences is useful in many time-series applications.

### Use merge to make an irregular index regular

The previous exercise taught you how to make a zero-width xts object with a regular time index. You can use the zero-width object to regularize an irregular xts object.

The regularized series usually has missing values (NA) because the irregular data does not have a value for all observations in the regular index. This exercise will teach you how to handle these missing values when you merge() the two series.

```{r}
# Merge irregular_xts and regular_xts
merged_xts <- merge(irregular_xts, regular_xts)

# Look at the first few rows of merged_xts
head(merged_xts)

# Use the fill argument to fill NA with their previous value
merged_filled_xts <- merge(irregular_xts, regular_xts, fill = na.locf)

# Look at the first few rows of merged_filled_xts
head(merged_filled_xts)
```
Filling forward is a useful operation, but be careful to make sure it is what you want!
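
The whole regularize-and-fill workflow can be sketched end to end on a small toy series. The four dates and values below are illustrative assumptions, and na.locf() is applied after the merge here instead of through the fill argument, which makes the intermediate NA values visible.

```r
library(xts)

# Toy irregular daily series (hypothetical values)
dates <- as.Date(c("2016-01-02", "2016-01-05", "2016-01-07", "2016-01-11"))
irregular <- xts(c(4, 21, 1, 34), order.by = dates)

# Zero-width xts object with a regular daily index spanning the same dates
regular_index <- seq(from = start(irregular), to = end(irregular), by = "day")
regular <- xts(, order.by = regular_index)

# Merging adds a row for every day; days without an observation are NA
merged <- merge(irregular, regular)

# Carry the last observation forward into the gaps
filled <- na.locf(merged)
filled
```

Each filled value is a repeat of an older observation, so the result looks more densely sampled than the underlying data really is.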

## Aggregating to lower frequency

### Aggregate daily data and merge with monthly data

Sometimes two series have the same periodicity, but use different conventions to represent a timestamp. For example, monthly series may be timestamped with the first or last date of the month. The different timestamp convention can cause many NA when series are merged. The yearmon class from the zoo package helps solve this problem.

In this exercise, you will aggregate the FRED daily Fed Funds rate (DFF) to a monthly periodicity and merge it with the FRED monthly Fed Funds rate (FEDFUNDS). The DFF aggregate will be timestamped with the last day of the month, while FEDFUNDS is timestamped with the first day of the month.
```{r}
getSymbols(c("FEDFUNDS", "DFF"), src = "FRED")
```
```{r}
# Aggregate DFF to monthly
monthly_fedfunds <- apply.monthly(DFF, mean, na.rm = TRUE)

# Convert index to yearmon
index(monthly_fedfunds) <- as.yearmon(index(monthly_fedfunds))

# Merge FEDFUNDS with the monthly aggregate
merged_fedfunds <- merge(FEDFUNDS, monthly_fedfunds)

# Look at the first few rows of the merged object
head(merged_fedfunds)
```
You will often need to aggregate to a lower frequency to align multiple time series.
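
To see the timestamp problem without downloading anything, here is a minimal sketch on made-up data. The daily and monthly series are hypothetical, and both indexes are converted to yearmon so the comparison is symmetric.

```r
library(xts)

# Hypothetical daily series covering January and February 2020
daily_index <- seq(as.Date("2020-01-01"), as.Date("2020-02-29"), by = "day")
daily <- xts(seq_along(daily_index), order.by = daily_index)

# Hypothetical monthly series stamped on the first day of each month
monthly <- xts(c(10, 20), order.by = as.Date(c("2020-01-01", "2020-02-01")))

# The monthly aggregate is stamped on the last day of each month
monthly_mean <- apply.monthly(daily, mean)

# Naive merge: the timestamps never match, so every row contains an NA
naive <- merge(monthly, monthly_mean)

# Converting both indexes to yearmon puts them on a common monthly timestamp
index(monthly) <- as.yearmon(index(monthly))
index(monthly_mean) <- as.yearmon(index(monthly_mean))
aligned <- merge(monthly, monthly_mean)
aligned
```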

### Align series to first and last day of month

Sometimes you may not be able to use convenience classes like yearmon to represent timestamps. This exercise will teach you how to manually align merged data to the timestamp representation you prefer.

First you merge the lower-frequency data with the aggregate data, then use na.locf() to fill the NA forward (or backward, using fromLast = TRUE). Then you can subset the result using the index of the object with the representation you prefer.

```{r}
# Aggregate DFF to monthly
monthly_fedfunds <- apply.monthly(DFF, mean, na.rm = TRUE)

# Merge FEDFUNDS with the monthly aggregate
merged_fedfunds <- merge(FEDFUNDS, monthly_fedfunds)

# Look at the first few rows of the merged object
head(merged_fedfunds)
```
```{r}
# Fill NA forward
merged_fedfunds_locf <- na.locf(merged_fedfunds)

# Extract index values containing last day of month
aligned_last_day <- merged_fedfunds_locf[index(monthly_fedfunds)]
head(aligned_last_day)
# Fill NA backward
merged_fedfunds_locb <- na.locf(merged_fedfunds, fromLast = TRUE)

# Extract index values containing first day of month
aligned_first_day <- merged_fedfunds_locb[index(FEDFUNDS)]
head(aligned_first_day)
```
Knowing how to manually align merged data will definitely come in handy!

### Aggregate to weekly, ending on Wednesdays

In this exercise, you will learn a general aggregation technique to aggregate daily data to weekly, but with weeks ending on Wednesdays. This is often done in stock market research to avoid intra-week seasonality.

You can supply your own end points to period.apply() (versus using endpoints()). Recall endpoints() returns locations of the last observation in each period specified by the on argument. The first and last elements of the result are always zero and the total number of observations, respectively. The end points you pass to period.apply() must follow this convention.

```{r}
# Extract index weekdays
index_weekdays <- .indexwday(DFF)

# Find locations of Wednesdays
wednesdays <- which(index_weekdays == 3)

# Create custom end points
end_points <- c(0, wednesdays, nrow(DFF))

# Calculate weekly mean using custom end points
weekly_mean <- period.apply(DFF, end_points, mean)
head(weekly_mean)
```
There are many ways to convert a time series to a lower frequency.
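
The end point convention is easier to verify on a short series where the groups can be counted by hand. The sketch below uses three weeks of made-up daily values starting on Wednesday, 2020-01-01. The unique() call is an added safeguard, not part of the exercise: it avoids a duplicated end point when the last observation happens to fall on a Wednesday.

```r
library(xts)

# Three weeks of hypothetical daily data; 2020-01-01 is a Wednesday
idx <- seq(as.Date("2020-01-01"), as.Date("2020-01-21"), by = "day")
x <- xts(seq_along(idx), order.by = idx)

# Weekday numbers run from 0 (Sunday) to 6 (Saturday), so Wednesday is 3
wednesdays <- which(.indexwday(x) == 3)

# End points must start at 0 and end at the number of observations
end_points <- unique(c(0, wednesdays, nrow(x)))

# Mean over each block of rows that ends on a Wednesday
weekly_mean <- period.apply(x, end_points, mean)
weekly_mean
```

The first block contains only the opening Wednesday, and the last block is a partial week that ends on the final observation instead of on a Wednesday.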

## Aggregating and combining intraday data

### Combine data that have timezones

Recall that xts objects store the time index as seconds since midnight, 1970-01-01 in the UTC timezone. merge() uses this underlying index and returns a result with the first object's timezone.

This exercise provides an example. The two objects in your workspace are identical except for the index timezone. The index values are the same instants in time, but measured in different locations. The london object's timezone is Europe/London and the chicago object's timezone is America/Chicago.

```{r include=FALSE}
london <- rbind(structure(1:5, .Dim = c(5L, 1L), index = structure(c(1262757600, 
1263225600, 1263236400, 1263240000, 1263456000), tzone = "Europe/London", tclass = c("POSIXct", 
"POSIXt")), class = c("xts", "zoo"), .Dimnames = list(NULL, "London")))

chicago <- rbind(structure(1:5, .Dim = c(5L, 1L), index = structure(c(1262757600, 
1263225600, 1263236400, 1263240000, 1263456000), tzone = "America/Chicago", tclass = c("POSIXct", 
"POSIXt")), class = c("xts", "zoo"), .Dimnames = list(NULL, "Chicago")))
```
```{r}
# Create merged object with a Europe/London timezone
tz_london <- merge(london, chicago)

# Look at tz_london structure
str(tz_london)

# Create merged object with an America/Chicago timezone
tz_chicago <- merge(chicago, london)

# Look at tz_chicago structure
str(tz_chicago)
```
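
A compact way to confirm the rule is to check the timezone of each merged result directly. The sketch below builds two small objects from the same two instants (hypothetical data) and assumes the xts accessor tzone() is available, as it is in current xts releases.

```r
library(xts)

# Two instants, expressed as seconds since 1970-01-01 UTC
secs <- c(1262757600, 1263225600)

# Same instants, displayed in two different timezones
london <- xts(1:2, order.by = as.POSIXct(secs, origin = "1970-01-01",
                                         tz = "Europe/London"))
chicago <- xts(1:2, order.by = as.POSIXct(secs, origin = "1970-01-01",
                                          tz = "America/Chicago"))

# The merged object takes the timezone of the first argument
tz_first_london <- tzone(merge(london, chicago))
tz_first_chicago <- tzone(merge(chicago, london))
c(tz_first_london, tz_first_chicago)
```

Because both objects hold the same instants, the merge has two rows and no NA values, regardless of the argument order.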
### Make irregular intraday data regular

Earlier you learned how to create a regular daily series from irregular daily data. Now you will create regular intra-day data from an irregular series.

Intra-day financial data often does not span a full 24 hour period. Most markets are usually closed at least part of the day. This exercise assumes markets open at 9AM and close at 4PM Monday-Friday.

Your data may not have an observation exactly at the market open and/or close. So, you would not be able to use start() and end() as you could with the daily data. You need to specify the start and end date-times to create this regular sequence.

The regular date-time sequence will include periods when markets are closed, but you can use xts' time-of-day subsetting to extract periods the market is open.
```{r include=FALSE}
irregular_xts <- rbind(structure(1:20, .Dim = c(20L, 1L), index = structure(c(1262606400, 
1262613600, 1262628000, 1262649600, 1262653200, 1262660400, 1262682000, 
1262689200, 1262746800, 1262761200, 1262786400, 1262797200, 1262818800, 
1262851200, 1262883600, 1262887200, 1262905200, 1262919600, 1262934000, 
1262937600), tzone = "", tclass = c("POSIXct", "POSIXt")), class = c("xts", 
"zoo"), .Dimnames = list(NULL, "data")))
```

```{r}
# Create a regular date-time sequence
regular_index <- seq(as.POSIXct("2010-01-04 09:00"), as.POSIXct("2010-01-08 16:00"), by = "30 min")

# Create a zero-width xts object
regular_xts <- xts(x = NULL, order.by = regular_index)

# Merge irregular_xts and regular_xts, filling NA with their previous value
merged_xts <- merge(irregular_xts, regular_xts, fill = na.locf)

# Subset to trading day (9AM - 4PM)
trade_day <- merged_xts["T09:00/T16:00"]
trade_day
```
Now you know how to subset your intra-day data to only contain the trading day!
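
The time-of-day subsetting can be checked on its own with a regular index that covers two full days. The data below are hypothetical, and the index is created in UTC so the result does not depend on the local timezone.

```r
library(xts)

# Two full days of 30-minute timestamps (48 per day), in UTC
idx <- seq(as.POSIXct("2010-01-04 00:00", tz = "UTC"),
           as.POSIXct("2010-01-05 23:30", tz = "UTC"), by = "30 min")
full_days <- xts(seq_along(idx), order.by = idx)

# Keep only observations between 9AM and 4PM on every day
trading_hours <- full_days["T09:00/T16:00"]

# Count the observations that remain for each day
obs_per_day <- sapply(split(trading_hours, f = "days"), nrow)
obs_per_day
```

Both ends of the range are inclusive, so the 09:00 and 16:00 observations are kept.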

### Fill missing values by trading day

The previous exercise carried the last observation of the prior day forward into the first observation of the following day. This exercise will show you how to fill missing values by trading day, without using the prior day's final value.

You will use the same split-lapply-rbind paradigm from the Introduction to xts and zoo course. For reference, the pattern is below.

    x_split <- split(x, f = "months")
    x_list <- lapply(x_split, cummax)
    x_list_rbind <- do.call(rbind, x_list)

Recall that the do.call(rbind, ...) syntax allows you to pass a list of objects to rbind() instead of having to type all their names.

```{r include=FALSE}
trade_day <- rbind(structure(c(NA, NA, NA, NA, NA, NA, 1L, NA, NA, NA, 2L, NA, NA, 
NA, NA, 7L, NA, NA, NA, 8L, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 11L, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), class = c("xts", 
"zoo"), index = structure(c(1262595600, 1262597400, 1262599200, 
1262601000, 1262602800, 1262604600, 1262606400, 1262608200, 1262610000, 
1262611800, 1262613600, 1262615400, 1262617200, 1262619000, 1262620800, 
1262682000, 1262683800, 1262685600, 1262687400, 1262689200, 1262691000, 
1262692800, 1262694600, 1262696400, 1262698200, 1262700000, 1262701800, 
1262703600, 1262705400, 1262707200, 1262768400, 1262770200, 1262772000, 
1262773800, 1262775600, 1262777400, 1262779200, 1262781000, 1262782800, 
1262784600, 1262786400, 1262788200, 1262790000, 1262791800, 1262793600, 
1262854800, 1262856600, 1262858400, 1262860200, 1262862000, 1262863800, 
1262865600, 1262867400, 1262869200, 1262871000, 1262872800, 1262874600, 
1262876400, 1262878200, 1262880000, 1262941200, 1262943000, 1262944800, 
1262946600, 1262948400, 1262950200, 1262952000, 1262953800, 1262955600, 
1262957400, 1262959200, 1262961000, 1262962800, 1262964600, 1262966400
), tzone = "", tclass = c("POSIXct", "POSIXt")), .Dim = c(75L, 
1L), .Dimnames = list(NULL, "data")))
```

```{r}
# Split trade_day into days
daily_list <- split(trade_day , f = "days")

# Use lapply to call na.locf for each day in daily_list
daily_filled <- lapply(daily_list, FUN = na.locf)

# Use do.call to rbind the results
filled_by_trade_day <- do.call(rbind, daily_filled)
filled_by_trade_day
```
You used advanced functions to transform data for each trading day!
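
The difference between filling across the whole series and filling within each day is easiest to see on a six-row example. The values below are made up; the second day deliberately starts with a missing value.

```r
library(xts)

# Two hypothetical trading days with three observations each, in UTC
idx <- as.POSIXct(c("2010-01-04 09:00", "2010-01-04 12:00", "2010-01-04 16:00",
                    "2010-01-05 09:00", "2010-01-05 12:00", "2010-01-05 16:00"),
                  tz = "UTC")
x <- xts(c(1, NA, 2, NA, 3, NA), order.by = idx)

# Filling the whole series carries day 1's last value into day 2's open
filled_naive <- na.locf(x)

# split-lapply-rbind fills each day separately
filled_by_day <- do.call(rbind, lapply(split(x, f = "days"), na.locf))

merge(filled_naive, filled_by_day)
```

In filled_by_day the first observation of the second day stays NA, because there is no earlier value on that day to carry forward.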

### Aggregate irregular intraday data

Intraday data can be huge, with hundreds of thousands of observations per day, millions per month, and hundreds of millions per year. These data sets often need to be aggregated before you can work with them.

You learned how to aggregate daily data in the Introduction to xts and zoo course. This exercise will use to.period() to aggregate intraday data to an OHLC series. You often need to specify both period and k arguments to aggregate intraday data.

```{r}
load("DC.RData")

dc_intraday <- DC[,1]
```
```{r}
# Convert raw prices to 5-second prices
xts_5sec <- to.period(dc_intraday, period = "seconds", k = 5)
head(xts_5sec)
# Convert raw prices to 10-minute prices
xts_10min <- to.period(dc_intraday, period = "minutes", k = 10)
head(xts_10min)
# Convert raw prices to 1-hour prices
xts_1hour <- to.period(dc_intraday, period = "hours", k = 1)
head(xts_1hour)
```
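
Since DC.RData is not distributed with these notes, here is a self-contained sketch of the same aggregation on made-up data: 30 one-minute prices collapsed into 10-minute OHLC bars. The price values are simply 1 to 30, so each bar's open and close can be checked by eye.

```r
library(xts)

# Thirty hypothetical one-minute prices starting at 09:00 UTC
idx <- seq(as.POSIXct("2016-01-04 09:00:00", tz = "UTC"),
           by = "1 min", length.out = 30)
prices <- xts(1:30, order.by = idx)

# Aggregate to 10-minute bars: period sets the unit, k the number of units
bars_10min <- to.period(prices, period = "minutes", k = 10)
bars_10min
```

Each bar is timestamped with the last observation it contains, and its four columns hold the open, high, low, and close of that 10-minute window.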
# Importing text data, and adjusting for corporate actions

## Importing text files

### Import well-formatted daily OHLC data

You can use getSymbols() to import a well-formatted CSV. In this case, well-formatted means the file contains data for a single instrument with date, open, high, low, close, volume, and adjusted close columns, in that order. You might have noticed that this is the same format as getSymbols() returns when you download data from internet sources.

getSymbols() allows you to use a directory of CSV files as a source (like Yahoo Finance and FRED). In this exercise, you will be using AMZN.csv in your working directory. It contains some randomized Amazon.com data from the first half of 2002. You can use dir() to see the file in your working directory.
```{r}
# Load AMZN.csv
getSymbols("AMZN", src = "csv")

# Look at AMZN structure
str(AMZN)
```
### Import text files in other formats

The previous exercise taught you how to import well-formatted CSV data using getSymbols(). Unfortunately, most data are not well-formatted.

The zoo package provides several functions to import text files as zoo objects. The main function is read.zoo(), which wraps read.table(). The xts class extends zoo, so you can easily convert the result of read.zoo() into an xts object by using as.xts().

```{r}
# Import AMZN.csv using read.zoo
amzn_zoo <- read.zoo("AMZN.csv", sep = ",", header = TRUE)

# Convert to xts
amzn_xts <- as.xts(amzn_zoo)

# Look at the first few rows of amzn_xts
head(amzn_xts)
```
As you will see, read.zoo() is a very flexible import function for time series.

### Handle date and time in separate columns

read.zoo() makes it easy to import data when the date and time are in separate columns. The index.column argument allows you to specify the name or number of the column(s) containing the index data. That's all you need to do if the date and time are specified in the standard format ("%Y-%m-%d" for date, and "%H:%M:%S" for time).

In this exercise, you will use the index.column argument to specify the date and time columns of the file. Your working directory has a file named UNE.csv that contains some 5-minute OHLC data for the energy company, Unron. You will still use read.csv() to find the column names of the date and time columns.
```{r}
# Read data with read.csv
une_data <- read.csv("UNE.csv", nrows = 5)

# Look at the structure of une_data
str(une_data)
```
```{r}
# Read data with read.zoo, specifying index columns
une_zoo <- read.zoo("UNE.csv", index.column = c("Date", "Time"), sep = ",", header = TRUE)

# Look at first few rows of data
head(une_zoo)
```
The index.column argument is great if your dates and times are in separate columns!
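
If you do not have UNE.csv at hand, the same idea can be tried on a small temporary file. The file contents below are invented for illustration, and tz = "UTC" is passed explicitly (an addition to the exercise code) so the combined date and time strings are parsed as POSIXct in a known timezone.

```r
library(zoo)

# Hypothetical file contents with date and time in separate columns
lines <- c(
  "Date,Time,Open,Close",
  "2016-11-08,09:05:00,10.5,10.6",
  "2016-11-08,09:10:00,10.6,10.4",
  "2016-11-08,09:15:00,10.4,10.7"
)
path <- tempfile(fileext = ".csv")
writeLines(lines, path)

# Both index columns are combined into a single date-time index
toy_zoo <- read.zoo(path, index.column = c("Date", "Time"),
                    sep = ",", header = TRUE, tz = "UTC")
toy_zoo
```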

### Read text file containing multiple instruments

The previous exercises work if each file contains only one instrument. Some software and data vendors may provide data for all instruments in one file. This exercise will teach you how to import a file that contains multiple instruments.

Once again, you can use read.zoo(). This time you will be using its split argument, which allows you to specify the name or number of the column(s) that contain the variables that identify unique observations.

The two_symbols.csv file in your working directory contains bid/ask data for two instruments, where each row has one bid or ask observation for one instrument. You will use the split argument to import the data into an object that has both bid and ask prices for both instruments on one row.
```{r}
# Read data with read.csv
two_symbols_data <- read.csv("two_symbols.csv", nrows = 5)

# Look at the structure of two_symbols_data
str(two_symbols_data)
```
```{r}
# Read data with read.zoo, specifying the split columns
two_symbols_zoo <- read.zoo("two_symbols.csv", split = c("Symbol", "Type"), sep = ",", header = TRUE)

# Look at first few rows of data
head(two_symbols_zoo)
```
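
The long-to-wide reshaping done by split can be sketched on a small temporary file. The symbols, dates, and prices below are made up and are not the contents of two_symbols.csv; format is passed explicitly so the index is parsed as a Date.

```r
library(zoo)

# Hypothetical long-format file: one bid or ask per row
lines <- c(
  "Date,Symbol,Type,Price",
  "2016-01-04,A,Bid,58.23",
  "2016-01-04,A,Ask,58.24",
  "2016-01-04,B,Bid,28.96",
  "2016-01-04,B,Ask,28.98",
  "2016-01-05,A,Bid,58.30",
  "2016-01-05,A,Ask,58.31",
  "2016-01-05,B,Bid,29.01",
  "2016-01-05,B,Ask,29.03"
)
path <- tempfile(fileext = ".csv")
writeLines(lines, path)

# One row per date, one column per Symbol/Type combination
wide_zoo <- read.zoo(path, split = c("Symbol", "Type"),
                     sep = ",", header = TRUE, format = "%Y-%m-%d")
wide_zoo
```

Eight input rows become two output rows, one per date, with a column for each of the four symbol and quote-type combinations.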
## Checking for weirdness

### Handle missing values

In chapter 3, you used na.locf() to fill missing values with the previous non-missing value. You can use interpolation when carrying the previous value forward isn't appropriate. In this exercise, you will explore two interpolation methods: linear and spline.

Linear interpolation calculates values that lie on a line between two known data points. This is a good choice for fairly linear data, like a series with a strong trend. Spline interpolation is more appropriate for series without a strong trend, because it calculates a non-linear approximation using multiple data points.

Use these two methods to interpolate the three missing values for the 10-year Treasury rate in the object DGS10. Then compare the results with the output of na.locf().

```{r include=FALSE}
DGS10 <- rbind(structure(c(4.94, 4.85, 4.78, 4.79, 4.85, NA, 4.99, 4.97, 4.86, 
4.8, 4.84, NA, NA, 4.64, 4.57, 4.63), class = c("xts", "zoo"), src = "FRED", updated = structure(1595959789.29678, class = c("POSIXct", 
"POSIXt")), index = structure(c(998870400, 998956800, 999043200, 
999129600, 999216000, 999475200, 999561600, 999648000, 999734400, 
999820800, 1000080000, 1000166400, 1000252800, 1000339200, 1000425600, 
1000684800), tzone = "UTC", tclass = "Date"), .Dim = c(16L, 1L
), .Dimnames = list(NULL, "DGS10")))
```

```{r}
# fill NA using last observation carried forward
locf <- na.locf(DGS10)

# fill NA using linear interpolation
approx <- na.approx(DGS10)

# fill NA using spline interpolation
spline <- na.spline(DGS10)

# merge into one object
na_filled <- merge(locf, approx, spline)

# plot combined object
plot(na_filled, col = c("black", "red", "green"))
```
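
To make the difference between the three methods concrete, the sketch below fills two gaps in a short, curved series of made-up values (the squares 1, 4, 9, ...). The object names avoid approx and spline so that the base R functions of the same name are not masked.

```r
library(xts)

# Hypothetical series following a curve, with two missing values
idx <- as.Date("2020-01-01") + 0:5
x <- xts(c(1, NA, 9, NA, 25, 36), order.by = idx)

# Last observation carried forward repeats the previous value
filled_locf <- na.locf(x)

# Linear interpolation takes the midpoint between the neighbours
filled_linear <- na.approx(x)

# Spline interpolation fits a smooth curve through the known points
filled_spline <- na.spline(x)

merge(filled_locf, filled_linear, filled_spline)
```

On this curved series the linear fill gives 5 and 17, while the carried-forward values of 1 and 9 lag well behind the trend.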
### Visualize imported data

It's important to check that your imported data is reasonable. A plot is a quick and easy way to spot oddities. In this exercise, you will use the plot() function to visualize some AAPL data from Yahoo Finance.

A stock split caused a huge price change in June 2014. Apple simultaneously increased the number of shares outstanding and decreased its stock price, leaving the company value unchanged. For example, a 2-for-1 split would double the shares outstanding, and reduce the stock price by 1/2.

You will also use the quantmod extractor functions Cl() and Ad() to access the close and adjusted close columns, respectively. Yahoo Finance provides the split- and/or dividend-adjusted close column.
```{r}
getSymbols("AAPL", src='yahoo', from = "2007-01-01", to = "2017-09-17")
```
```{r}
head(AAPL)
```
```{r}
# Look at the last few rows of AAPL data
tail(AAPL)

# Plot close price, extracted with Cl()
plot(Cl(AAPL))

# Plot adjusted close price, extracted with Ad()
plot(Ad(AAPL))
```
### Cross reference sources

In this exercise, you will cross-reference the AAPL raw price data from the previous exercise with AAPL data from another source.

The new data is already adjusted for splits, but not dividends. So the close prices from the new data won't align closely with the adjusted close prices from the previous exercise (which are adjusted for both splits and dividends). You will learn more about the adjustment process in the next video.

You will compare raw, unadjusted AAPL data with split-adjusted AAPL data. The data have already been loaded to your workspace in aapl_raw and aapl_split_adjusted, respectively.

    # Look at first few rows of aapl_raw
    head(aapl_raw)
    
    # Look at first few rows of aapl_split_adjusted
    head(aapl_split_adjusted)
    
    # Plot difference between adjusted close and split-adjusted close
    plot(Ad(aapl_raw) - Cl(aapl_split_adjusted))
    
    # Plot difference between volume from the raw and split-adjusted sources
    plot(Vo(aapl_raw) - Vo(aapl_split_adjusted))

The volumes agree on most (but not all) days, whereas the close prices are completely different.

## Adjusting for corporate actions

### Adjust for stock splits and dividends

Stock splits can create large historical price changes even though they do not change the value of the company. So, you must adjust all pre-split prices in order to calculate historical returns correctly.

Similarly, you must adjust all pre-dividend prices. Dividends do reduce the company's value by the amount of the dividend payment, but the investor's return isn't affected because they receive the offsetting dividend payment.

In this exercise, you will learn how to use the adjustOHLC() function to adjust raw historical OHLC prices for splits and dividends, so historical returns can be calculated accurately.

Yahoo Finance provides raw prices and a split- and dividend-adjusted close column. The output of adjustOHLC() should match Yahoo's adjusted close column. AAPL data from Yahoo Finance is already loaded in your workspace.

While not necessary to complete this exercise, Yahoo Finance provides an [accessible example](https://help.yahoo.com/kb/finance/SLN2311.html) of the adjusted close calculation, if you're interested in learning more.

```{r}
# Look at first few rows of AAPL
head(AAPL)

# Adjust AAPL for splits and dividends
aapl_adjusted <- adjustOHLC(AAPL)

# Look at first few rows of aapl_adjusted
head(aapl_adjusted)
```
### Download split and dividend data

In the previous exercise, you used adjustOHLC() to adjust raw historical OHLC prices for splits and dividends, but it only works for OHLC data. It will not work if you only have close prices, and it does not return any of the split or dividend data it uses.

You need the dates and values for each split and dividend to adjust a non-OHLC price series, or if you simply want to analyze the raw split and dividend data.

You can download the split and dividend data from Yahoo Finance using the quantmod functions getSplits() and getDividends(), respectively. The historical dividend data from Yahoo Finance is adjusted for splits. If you want to download unadjusted dividend data, you need to set split.adjust = FALSE in your call to getDividends().

```{r}
# Download AAPL split data
splits <- getSplits("AAPL")

# Download AAPL dividend data
dividends <- getDividends("AAPL")

# Look at the first few rows of dividends
head(dividends)

# Download unadjusted AAPL dividend data
raw_dividends <- getDividends("AAPL", split.adjust = FALSE)

# Look at the first few rows of raw_dividends
head(raw_dividends)
```
It's important to get splits and dividends correct when calculating historical returns.

### Adjust univariate data for splits and dividends

If you only have close prices, you can adjust them with adjRatios(). It has 3 arguments: splits, dividends, and close. It returns an xts object with split and dividend adjustment ratios in columns "Split" and "Div", respectively.

You need to provide split data via the splits argument to calculate the split ratio. To calculate the dividend ratio, you need to provide raw dividends and raw prices via the dividends and close arguments, respectively.

Once you have the split and dividend adjustment ratios, you calculate the adjusted price by multiplying the unadjusted price by both the split and dividend adjustment ratios.

```{r}
# Calculate split and dividend adjustment ratios
ratios <- adjRatios(splits = splits, dividends = raw_dividends, close = Cl(AAPL))

# Calculate adjusted close for AAPL
aapl_adjusted <- Cl(AAPL) * ratios[, "Split"] * ratios[, "Div"]

# Look at first few rows of Yahoo adjusted close
head(Ad(AAPL))

# Look at first few rows of aapl_adjusted
head(aapl_adjusted)
```
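
The multiplication itself can be illustrated with plain vectors. The sketch below is a hand-rolled example on made-up prices, not the adjRatios() implementation: it assumes a 2-for-1 split on day 3 and a $1 dividend going ex on day 5, and it assumes the common convention that the dividend factor is one minus the dividend divided by the prior day's close.

```r
# Hypothetical raw closing prices over five days
close <- c(100, 102, 51, 52, 50)

# Assumed 2-for-1 split effective on day 3: pre-split prices are halved
split_ratio <- c(0.5, 0.5, 1, 1, 1)

# Assumed $1 dividend with ex-date on day 5; the prior close is 52
div_factor <- 1 - 1 / 52
div_ratio <- c(rep(div_factor, 4), 1)

# Adjusted price = raw price * split ratio * dividend ratio
adjusted <- close * split_ratio * div_ratio

# Daily returns from raw and adjusted prices
raw_returns <- diff(close) / head(close, -1)
adj_returns <- diff(adjusted) / head(adjusted, -1)
round(cbind(raw_returns, adj_returns), 4)
```

The raw return across the split is -50%, while the adjusted return for the same day is zero, which matches the fact that a split does not change the value of the position.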
