Background
Individual stock prices tend to exhibit highly non-constant variance, and thus ARIMA models built upon such data
would likely exhibit non-constant variance in their residuals. In this problem we are going to analyze the Apple stock price data from
August 2013 through the end of July 2023. We will use an ARIMA-GARCH model for the daily and weekly stock price (adjusted close
price at the end of the day for daily data or at the end of the week for weekly data), with a focus on the behavior of its volatility as
well as forecasting both the price and the volatility.
Data import and cleaning
## Libraries used within this homework are uploaded here
library(zoo,warn.conflicts=FALSE)
library(lubridate,warn.conflicts=FALSE)
library(mgcv,warn.conflicts=FALSE)
library(rugarch,warn.conflicts=FALSE)
#importing the data
dailydata <- read.csv("DailyAAPL.csv", header = TRUE)
weeklydata <- read.csv("WeeklyAAPL.csv", header = TRUE)
#cleaning the data
#dates to date format
weeklydata$Date<-as.Date(weeklydata$Date,format='%m/%d/%y')
dailydata$Date<-as.Date(dailydata$Date,format='%m/%d/%y')
#prices to timeseries format
AAPLWeekly <- ts(weeklydata$Close, start = c(2013,8,1), frequency = 52)   # 52 weeks per year
AAPLDaily <- ts(dailydata$Close, start = c(2013,8,1), frequency = 252)    # 252 trading days per year
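Two details of the cleaning step are easy to get wrong, so a small self-contained check may help. The sketch below uses made-up date strings (the actual layout of the CSV files is not shown here) to illustrate that `as.Date` returns `NA` for a string that does not match the given format, and that `ts()` reads `start` as `c(year, period)`, using only its first two elements.

```r
# Hypothetical date strings, not taken from the AAPL files
raw_dates <- c("8/1/13", "8/2/13", "2013-08-05")
parsed <- as.Date(raw_dates, format = "%m/%d/%y")
print(parsed)           # the third entry does not match the format
print(anyNA(parsed))    # TRUE flags a parsing problem worth investigating

# ts() interprets start as c(year, period within the year)
x <- ts(1:10, start = c(2013, 8, 1), frequency = 52)
print(start(x))         # year and period actually used
print(tsp(x)[1])        # start time on the decimal-year axis
```

If the files store four-digit years, `%Y` would be needed in place of `%y`. With `frequency = 52`, a series beginning in early August would correspond to roughly period 31 of the year, so the time axis produced by `start = c(2013, 8, 1)` may be worth checking against the `Date` column.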
Question 1: Exploratory Data Analysis (20 points)
1a. Based on your intuition, when would you use daily vs weekly stock price data?
Response: Daily stock price data are better suited to short-term forecasting and to granular analysis of price volatility over
short time intervals. Weekly stock price data are better suited to analyzing longer intervals and capturing any cyclical patterns.
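To make the relationship between the two frequencies concrete, here is a minimal sketch of how an end-of-week series could be derived from daily closes. The prices are simulated and the grouping rule (last available close in each Monday-based calendar week) is an assumption; the provided weekly file may have been built differently.

```r
# Hypothetical daily closes for three trading weeks (Mon-Fri), not real AAPL data
dates <- seq(as.Date("2023-07-03"), as.Date("2023-07-21"), by = "day")
wday  <- as.POSIXlt(dates)$wday          # 0 = Sunday, 6 = Saturday
dates <- dates[wday %in% 1:5]            # keep weekdays only
set.seed(1)
close <- round(190 + cumsum(rnorm(length(dates))), 2)

# Group by calendar week (weeks start on Monday) and keep the last close
week <- cut(dates, breaks = "week")
weekly_close <- tapply(close, week, function(x) x[length(x)])
print(weekly_close)
```

Each weekly value is one of the daily values, which is why the weekly series looks like a thinned, smoother version of the daily one.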
1b. Plot the time series plots comparing daily vs weekly data. How do the daily vs weekly time series data compare?
par(mfrow=c(1,1))
plot(dailydata$Date, dailydata$Close, type = 'l', col = 'black', xlab = "Date", ylab = "Stock Price")
lines(weeklydata$Date, weeklydata$Close, type = 'l', col = 'blue')
legend("topleft", legend=c("Daily","Weekly"), fill=c("black","blue"))
Response: Daily vs Weekly Time Series data comparison
Apple stock price appears to be more volatile in recent years. The daily price data show more fluctuation and variation
than the weekly data, which are slightly smoother. The volatility in price is still clearly visible in the weekly data, so the
weekly price data appear sufficient for analyzing trends, stationarity and residuals.
1c. Fit a non-parametric trend using splines regression to both the daily and weekly time series data. Overlay the fitted trends.
How do the trends compare?
Analyzing weekly and daily data with trend fitting
par(mfrow = c(1,1))
# daily data: rescale the time index and fit a spline trend
time.ptsd = c(1:length(AAPLDaily))
time.ptsd = c(time.ptsd - min(time.ptsd))/max(time.ptsd)
egam.fit.daily <- gam(AAPLDaily ~ s(time.ptsd))
# reuse the time attributes of the original series so the overlay lines up
eu.fit.gam.daily <- ts(fitted(egam.fit.daily), start = start(AAPLDaily), frequency = frequency(AAPLDaily))
plot.ts(AAPLDaily, lwd = 2, col = "black", ylab = 'Stock Price', main = "Daily Apple Stock Price")
lines(eu.fit.gam.daily, lwd = 2, col = "red")
# weekly data: same procedure
time.ptsw = c(1:length(AAPLWeekly))
time.ptsw = c(time.ptsw - min(time.ptsw))/max(time.ptsw)
egam.fit.weekly <- gam(AAPLWeekly ~ s(time.ptsw))
eu.fit.gam.weekly <- ts(fitted(egam.fit.weekly), start = start(AAPLWeekly), frequency = frequency(AAPLWeekly))
lines(eu.fit.gam.weekly, lwd = 2, col = "blue")
legend("topleft", c("Daily", "Weekly"), col = c("red", "blue"), lwd = 2)
# weekly series with both fitted trends overlaid
plot(AAPLWeekly, lwd = 2, col = "black", ylab = 'Stock Price', main = "Weekly Apple Stock Price")
lines(eu.fit.gam.daily, lwd = 2, col = "red")
lines(eu.fit.gam.weekly, lwd = 2, col = "blue")
legend("topleft", c("Daily", "Weekly"), col = c("red", "blue"), lwd = 2)
Response: Daily vs Weekly Time Series data trend fit
The trend fitted to the weekly data seems to track the series slightly better than the trend fitted to the daily data. However,
both fits capture the slight fluctuations from 2013 to 2020 and the upward trend from 2020 onward.
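The visual comparison of trend fits can be complemented with a couple of numbers. The sketch below is not the analysis above: it fits the same kind of `mgcv::gam` spline to a simulated random walk and reports the effective degrees of freedom of the smooth and the spread of the series around the fitted trend. Applying the same two summaries to the daily and weekly fits would be one possible way to back up the statement that one trend tracks its series more closely.

```r
library(mgcv)

set.seed(123)
# Simulated random walk standing in for a price series (not AAPL data)
n <- 500
price <- 100 + cumsum(rnorm(n, mean = 0.05, sd = 1))
t_scaled <- (seq_len(n) - 1) / (n - 1)   # time index rescaled to [0, 1]

fit <- gam(price ~ s(t_scaled))          # penalized regression spline in time
trend <- fitted(fit)
resid_sd <- sd(price - trend)            # spread of the series around the trend

cat("effective degrees of freedom:", round(sum(fit$edf), 2), "\n")
cat("residual sd around trend:", round(resid_sd, 3), "\n")
```

A larger effective degrees of freedom means a wigglier trend, and a smaller residual standard deviation means the trend follows the series more closely.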
1d. Consider the return stock price computed as provided in the Canvas homework assignment. Apply this formula to compute
the return price based on the daily and weekly time series data. Plot the return time series and their corresponding ACF plots.
How do the return time series compare in terms of stationarity and serial dependence?
Analyzing weekly and daily return data and comparing with original data
par(mfrow = c(2,2),mar=c(3,3,3,3))
dailydata$return <-c(diff(dailydata$Close)/dailydata$Close[-length(dailydata$Close)], NA)
datadailydiff=diff(AAPLDaily)
returndailydata=datadailydiff/AAPLDaily[-length(AAPLDaily)]
ts.plot(returndailydata, lwd = 2, col = "blue", main = "Daily Return Data")
acf(returndailydata, lag.max = 6*25, main = "ACF plot of Daily Return Data")
weeklydata$return <-c(diff(weeklydata$Close)/weeklydata$Close[-length(weeklydata$Close)], NA)
dataweeklydiff=diff(AAPLWeekly)
returnweeklydata=dataweeklydiff/AAPLWeekly[-length(AAPLWeekly)]
ts.plot(returnweeklydata, lwd = 2, col = "red", main = "Weekly Return Data")
acf(returnweeklydata, lag.max = 6*25, main = "ACF plot of Weekly Return Data")
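The Canvas formula itself is not reproduced in this document, but the code above computes a simple return, r_t = (P_t - P_{t-1}) / P_{t-1}. A tiny worked example with made-up prices shows the arithmetic:

```r
# Hypothetical prices, chosen so the arithmetic is easy to verify by hand
price <- c(100, 110, 99, 99)

# Simple return: (P_t - P_{t-1}) / P_{t-1}, matching diff(x) / x[-length(x)]
simple_return <- diff(price) / price[-length(price)]
print(simple_return)   # 0.1, -0.1, 0
```

A series of n prices yields n - 1 returns, which is why the data-frame columns above are padded with one `NA`.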
Response: Return series vs price series analysis
According to the daily and weekly return time series plots, there appears to be some indication of non-constant variance,
with several episodes of large volatility. The ACF plots indicate that both return series are approximately stationary in the
mean: the autocorrelations fall within the confidence bands immediately after lag 0, so both series resemble white noise
with no serial dependence.
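The visual reading of the ACF can be complemented with a formal test. The sketch below is an illustration on simulated returns, not a result for the Apple data: it generates a series with GARCH(1,1)-style volatility clustering (all parameter values are hypothetical) and applies the Ljung-Box test from `stats::Box.test` to the returns and to the squared returns. The first checks serial correlation in the level of the returns. The second checks for dependence in the variance, which is the feature a GARCH component is meant to capture.

```r
set.seed(2023)
# Simulated returns with volatility clustering, not AAPL data
n <- 1000
omega <- 1e-5; alpha <- 0.1; beta <- 0.85   # hypothetical GARCH(1,1) parameters
r <- numeric(n)
sigma2 <- rep(omega / (1 - alpha - beta), n) # start at the unconditional variance
for (i in 2:n) {
  sigma2[i] <- omega + alpha * r[i - 1]^2 + beta * sigma2[i - 1]
  r[i] <- sqrt(sigma2[i]) * rnorm(1)
}

# Ljung-Box tests: the null hypothesis is no autocorrelation up to the given lag
lb_returns <- Box.test(r, lag = 10, type = "Ljung-Box")
lb_squared <- Box.test(r^2, lag = 10, type = "Ljung-Box")
cat("p-value, returns:        ", signif(lb_returns$p.value, 3), "\n")
cat("p-value, squared returns:", signif(lb_squared$p.value, 3), "\n")
```

A small p-value for the squared returns alongside a large one for the returns would be consistent with a series that looks like white noise in the mean but still has time-varying volatility.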
Question 2: ARIMA(p,d,q) for Stock Price (20 points)
2a. Divide the data into training and testing data sets, where the training data exclude the last week of data (July 20th-July 24th)
and the testing data consist of that last week. Apply the iterative model to fit an ARIMA(p,d,q) model with max AR and
MA orders of 7 and difference orders 1 and 2 separately to the training datasets of the daily and weekly data. Display the
summary of the final model fit.
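The procedure described in 2a can be sketched as a grid search over the AR and MA orders for a fixed difference order. This is a minimal illustration under stated assumptions rather than the course's reference code: the series is simulated, the helper name `select_arima` is hypothetical, AIC is assumed as the selection criterion (the question does not name one), and the maximum order is reduced to 3 to keep the example fast.

```r
set.seed(42)
# Simulated stand-in for a price series (not AAPL data)
price <- 100 + cumsum(as.numeric(arima.sim(list(ar = 0.5), n = 300)))

# Hold out the last 5 observations (one trading week) for testing
n_test <- 5
train <- head(price, length(price) - n_test)
test  <- tail(price, n_test)

# Hypothetical helper: fit every ARIMA(p, d, q) with p, q <= max_order
# and keep the fit with the smallest AIC
select_arima <- function(x, d, max_order = 7) {
  best <- list(aic = Inf, order = NULL, fit = NULL)
  for (p in 0:max_order) {
    for (q in 0:max_order) {
      fit <- tryCatch(
        suppressWarnings(arima(x, order = c(p, d, q), method = "ML")),
        error = function(e) NULL   # skip orders that fail to estimate
      )
      if (!is.null(fit) && is.finite(fit$aic) && fit$aic < best$aic) {
        best <- list(aic = fit$aic, order = c(p, d, q), fit = fit)
      }
    }
  }
  best
}

best_d1 <- select_arima(train, d = 1, max_order = 3)
best_d2 <- select_arima(train, d = 2, max_order = 3)
print(best_d1$order)
print(best_d2$order)

# Forecast the held-out week from the d = 1 model
fc <- as.numeric(predict(best_d1$fit, n.ahead = n_test)$pred)
print(cbind(actual = test, forecast = fc))
```

AIC values are only compared among models with the same `d`, since differencing changes the data the likelihood is computed on. That is one reason to run the search for `d = 1` and `d = 2` separately, as the question asks.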