2/14/25, 12:02 AM ISYE 6402 Homework 4
ISYE 6402 Homework 4
Background
For this data analysis, you will analyze the daily and weekly domestic passenger count arriving in Hawaii airports.
File DailyDomestic.csv contains the daily number of passengers between May 2019 and February 2023. File
WeeklyDomestic.csv contains the weekly number of passengers for the same time period. Here we will use
different ways of fitting the ARIMA model while dealing with trend and seasonality.
library(lubridate)
library(mgcv)
library(tseries)
library(car)
Instructions on reading the data
To read the data in R, save the file in your working directory (make sure you have changed the directory if
different from the R working directory) and read the data using the R function read.csv().
daily <- read.csv("DailyDomestic.csv", header = TRUE)
daily$date <- as.Date(daily$date)
weekly <- read.csv("WeeklyDomestic.csv", header = TRUE)
weekly$week <- as.Date(weekly$week)
Question 1. Trend and seasonality estimation
1a. Plot the daily and weekly domestic passenger count separately. Do you see a strong trend and seasonality?
# plot daily time series
daily_ts = ts(daily$domestic, start= decimal_date(ymd("2019-05-01")) , frequency = 365)
ts.plot(daily_ts, ylab="domestic_passenger_count", main="Daily Data")
# plot weekly time series
weekly_ts = ts(weekly$domestic, start= decimal_date(ymd("2019-05-05")) , frequency = 52)
ts.plot(weekly_ts, ylab="domestic_passenger_count", main="Weekly Data")
Response
There appears to be strong seasonality: both series show cyclical patterns with a roughly yearly cycle. There are
roughly two peaks per year, one at the beginning of the year and the other at the end. A slight upward trend is
observed as well, but the trend is not strong.
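One way to back up a visual impression of yearly peaks is to average the counts by calendar month. The sketch below is illustrative only: it uses a simulated weekly series (the names `week`, `count`, and `monthly_mean` are hypothetical, and the numbers are not the Hawaii data), and it shows the idea rather than the analysis used in this report.

```r
set.seed(42)

# Simulated weekly counts, NOT the Hawaii data (assumption for illustration)
week  <- seq(as.Date("2019-05-05"), by = "week", length.out = 200)
m     <- as.integer(format(week, "%m"))          # calendar month 1..12
count <- 1000 + 150 * cos(2 * pi * (m - 1) / 12) + rnorm(length(week), sd = 30)

# Average count per calendar month; high months suggest seasonal peaks
monthly_mean <- tapply(count, m, mean)
print(round(monthly_mean))
```

With the real data, `count` and `week` would be replaced by `weekly$domestic` and `weekly$week`.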
1b. (Trend and seasonality) Fit the weekly domestic passenger count with a non-parametric trend using splines
and monthly seasonality using ANOVA. Is the seasonality significant? Plot the fitted values together with the
original time series. Plot the residuals and the ACF of the residuals. Comment on how the model fits and on the
appropriateness of the stationarity assumption of the residuals.
# x-axis points converted to 0-1 scale, common in nonparametric regression
time.pts = c(1:length(weekly_ts))
time.pts = c(time.pts - min(time.pts))/max(time.pts)
# splines Trend Estimation
weekly_month <- as.factor(month(weekly$week))
weekly.gam.fit = gam(weekly_ts~s(time.pts)+weekly_month)
weekly_ts.fit.gam = ts(fitted(weekly.gam.fit), start=decimal_date(ymd("2019-05-05")), frequency=52)
ts.plot(weekly_ts, ylab="domestic_passenger_count")
lines(weekly_ts.fit.gam, lwd=2, col = 'purple')
# calculate residual
weekly_ts.dif.gam = ts((weekly_ts - weekly_ts.fit.gam), start=decimal_date(ymd("2019-05-05")), frequency=52)
# plot residual
ts.plot(weekly_ts.dif.gam,ylab="residual")
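The question also asks whether the monthly seasonality is significant and how the residual ACF looks. A minimal sketch of one way to check both is given below, assuming a model of the same form as above (spline trend plus month factor). It runs on simulated data so that it is self-contained; `y`, `t01`, `month_f`, `fit_full`, and `acf_res` are hypothetical names, and the output says nothing about the Hawaii series.

```r
library(mgcv)

set.seed(1)

# Simulated stand-in for the weekly series, NOT the Hawaii data
n       <- 200
week    <- seq(as.Date("2019-05-05"), by = "week", length.out = n)
m       <- as.integer(format(week, "%m"))
month_f <- factor(m)
t01     <- (seq_len(n) - 1) / n                  # time rescaled to [0, 1)
y       <- 1000 + 300 * t01 + 150 * sin(2 * pi * m / 12) + rnorm(n, sd = 40)

# Spline trend plus monthly seasonality as a factor (ANOVA-style term)
fit_full <- gam(y ~ s(t01) + month_f)

# Wald-type test for the month factor as a whole
print(anova(fit_full))
p_month <- summary(fit_full)$pTerms.pv           # p-value for month_f

# Sample ACF of the residuals, up to one year of weekly lags
res     <- residuals(fit_full)
acf_res <- acf(res, lag.max = 52, plot = FALSE)
```

A small p-value for the month term suggests the monthly seasonality is significant; residual autocorrelations that stay well outside the confidence bands over many lags would argue against treating the residuals as stationary. With the real data, the same calls would apply to `weekly.gam.fit` and `weekly_ts.dif.gam`.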