ARIMA Model
Understanding ARIMA Model
ARIMA stands for AutoRegressive Integrated Moving Average.
AutoRegressive:
- These are time series models that predict values based on their past values.
- For example, if today is sunny, we might predict that tomorrow will also be sunny. Here, we are predicting the weather based on its past values.
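To make the autoregressive idea concrete, here is a minimal sketch in plain Python. It assumes the simplest case, an AR(1) process, where each value is a fraction `phi` of the previous value plus random noise. The helper names `simulate_ar1` and `estimate_phi` and the value 0.7 are illustrative choices, not part of any library.

```python
import random

def simulate_ar1(phi, n, seed=0):
    """Generate n values where each one is phi * previous value + noise."""
    rng = random.Random(seed)
    values = [0.0]
    for _ in range(n - 1):
        values.append(phi * values[-1] + rng.gauss(0, 1))
    return values

def estimate_phi(values):
    """Least-squares estimate of phi from (previous value, current value) pairs."""
    num = sum(prev * cur for prev, cur in zip(values, values[1:]))
    den = sum(prev * prev for prev in values[:-1])
    return num / den

series = simulate_ar1(phi=0.7, n=2000)
phi_hat = estimate_phi(series)

# One-step-ahead forecast: the next value is predicted from the latest value only.
next_value = phi_hat * series[-1]
print(f"estimated phi = {phi_hat:.2f}, forecast for next step = {next_value:.2f}")
```

The estimate should land close to the 0.7 used in the simulation, which is the sense in which an AR model "learns" how strongly the past carries over into the future.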
Integrated:
- The AR and MA parts of the model assume stationary data, where the mean and variance remain roughly constant over time. The "Integrated" part is what handles series that are not stationary.
- Suppose we have the stock price of Tesla, and it rises every day. The series is not stationary, since its mean keeps increasing.
- To make such a series stationary, we subtract the previous value from the current value (this is called differencing). One round of differencing is often enough to remove a steady trend; if it is not, the differenced series can be differenced again.
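Differencing replaces each value with its change from the value before it (current minus previous). The sketch below shows this in plain Python; the `difference` helper and the price lists are made up for illustration and are not real Tesla data.

```python
def difference(series, d=1):
    """Apply first-order differencing d times: each value minus the one before it."""
    for _ in range(d):
        series = [cur - prev for prev, cur in zip(series, series[1:])]
    return series

# Hypothetical prices with a steady upward trend.
prices = [100, 103, 106, 109, 112, 115]
print(difference(prices))              # [3, 3, 3, 3, 3]

# A trend that itself accelerates needs a second round of differencing.
accelerating = [100, 101, 103, 106, 110, 115]
print(difference(accelerating))        # [1, 2, 3, 4, 5]
print(difference(accelerating, d=2))   # [1, 1, 1, 1]
```

The number of rounds needed before the result stops trending is the `d` in the model's `order=(p, d, q)`.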
Moving Average:
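- In ARIMA, the moving-average part models the current value as a weighted combination of recent forecast errors, that is, the gaps between past predictions and what actually happened.
- Despite the similar name, this is not the same as a rolling average of the data over a fixed window.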
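Note that the MA term in an ARIMA model is built from past random errors rather than from a rolling mean of the data. A minimal sketch, assuming the simplest MA(1) case with an illustrative weight `theta = 0.6`: because each value only shares an error with its immediate neighbour, the series is correlated with itself at lag 1 but not at lag 2. The helper names are hypothetical.

```python
import random

def simulate_ma1(theta, n, seed=1):
    """y_t = e_t + theta * e_(t-1), where e_t is random noise (the 'error')."""
    rng = random.Random(seed)
    errors = [rng.gauss(0, 1) for _ in range(n + 1)]
    return [errors[t] + theta * errors[t - 1] for t in range(1, n + 1)]

def autocorrelation(values, lag):
    """Sample correlation between the series and itself shifted by `lag` steps."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    cov = sum(a * b for a, b in zip(centered, centered[lag:]))
    var = sum(c * c for c in centered)
    return cov / var

theta = 0.6
series = simulate_ma1(theta, n=5000)
rho1 = autocorrelation(series, 1)
rho2 = autocorrelation(series, 2)
expected_rho1 = theta / (1 + theta ** 2)  # theoretical lag-1 value for MA(1)
print(f"lag 1: {rho1:.2f} (theory {expected_rho1:.2f}), lag 2: {rho2:.2f} (theory 0.00)")
```

This "memory" that cuts off after a fixed number of lags is what the moving-average order controls.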
Example:
Imagine you’re a weather enthusiast recording daily temperatures in your city for a year. You notice that temperatures vary significantly from day to day but also follow a warming trend over time.
- AutoRegressive (AR): Today’s temperature is likely influenced by the temperatures of the past few days. If recent days were warm, today’s temperature might also be warmer.
- Integrated (I): Over the year, temperatures are generally rising. To remove this trend, you subtract the previous day’s temperature from each day’s temperature, creating a dataset of day-to-day temperature changes.
- Moving Average (MA): You look at how far off your recent predictions of these daily changes were. If yesterday’s prediction was too low, you nudge today’s prediction up by a fraction of that error. This lets the model absorb short-lived shocks, such as a brief cold snap.
ARIMA Model Usage:
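# Example code for using ARIMA (assumes the statsmodels library)
from statsmodels.tsa.arima.model import ARIMA
# stock_prices: a 1-D series of prices; p, d, q are explained in the next section
p, d, q = 1, 1, 1  # placeholder orders, not tuned
model = ARIMA(stock_prices, order=(p, d, q))
model_fit = model.fit()
forecast = model_fit.forecast(steps=30)  # forecast the next 30 periods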
Understanding ARIMA Parameters
The key points to understand here are p, d, and q. Let’s delve into their meanings: 👓
- p: Autoregressive Order. This is the number of past observations considered for making future predictions. ⌛
- q: Moving Average Order. This is the number of past forecast errors (residuals) used when making future predictions. 📊
- d: Integration Order. This is the number of times the series is differenced to make it stationary. With d = 1, each value y(t) is replaced by y(t) - y(t-1).
For further clarity and a practical implementation, I will add a Kaggle notebook that explains these concepts in detail in the comments soon. The notebook will also guide you through methods for identifying suitable values for these parameters. 📓
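As a rough illustration of one common method for identifying p, the sketch below computes the partial autocorrelation function (PACF) in plain Python. For a pure AR(p) series the PACF drops to roughly zero after lag p. The data is a simulated AR(2) series with made-up coefficients, and the helper names are hypothetical; in practice a library routine would normally be used instead.

```python
import math
import random

def autocorrelations(values, max_lag):
    """Sample autocorrelations r[0..max_lag] (r[0] is always 1)."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    var = sum(c * c for c in centered)
    return [sum(a * b for a, b in zip(centered, centered[k:])) / var
            for k in range(max_lag + 1)]

def partial_autocorrelations(r):
    """Durbin-Levinson recursion: turns autocorrelations into PACF values."""
    max_lag = len(r) - 1
    pacf = [1.0, r[1]]
    prev = {1: r[1]}  # AR coefficients of the lag-1 fit
    for k in range(2, max_lag + 1):
        num = r[k] - sum(prev[j] * r[k - j] for j in range(1, k))
        den = 1.0 - sum(prev[j] * r[j] for j in range(1, k))
        phi_kk = num / den
        cur = {j: prev[j] - phi_kk * prev[k - j] for j in range(1, k)}
        cur[k] = phi_kk
        pacf.append(phi_kk)
        prev = cur
    return pacf

# Simulated AR(2) series: y_t = 0.5*y_(t-1) + 0.3*y_(t-2) + noise.
rng = random.Random(7)
series = [0.0, 0.0]
for _ in range(5000):
    series.append(0.5 * series[-1] + 0.3 * series[-2] + rng.gauss(0, 1))

pacf = partial_autocorrelations(autocorrelations(series, max_lag=6))
band = 1.96 / math.sqrt(len(series))  # rough 95% band around zero
for lag in range(1, 7):
    flag = "significant" if abs(pacf[lag]) > band else "near zero"
    print(f"lag {lag}: {pacf[lag]:+.3f} ({flag})")
```

The first two lags should stand out clearly, pointing to p = 2. The same reading applied to the plain autocorrelations suggests q, and d is the number of differencing rounds needed before the series stops trending.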
Pros of ARIMA
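- It can capture both dependence on past values (through the autoregressive terms) and the lingering effect of recent shocks (through the moving-average terms), while differencing takes care of trends.
- ARIMA does not require the raw data to follow a particular distribution, although the usual fitting method and its prediction intervals assume the forecast errors are uncorrelated and roughly normally distributed.
- An ARIMA model has only a small number of parameters, so it is easy to use and takes little time to tune.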
Cons of ARIMA
- ARIMA does not perform well on complex data patterns; it assumes the relationship between future and past values is linear or close to linear.
- ARIMA can take a long time to train on huge datasets compared with some other time series models.
- ARIMA can be sensitive to outliers and extreme values, potentially leading to inaccurate forecasts.
When and When not to use ARIMA
- ARIMA performs well when the data is stationary (its statistical properties, such as mean and variance, don’t change over time). If the data is not stationary, differencing can be used to make it stationary.
- If the data exhibits a clear trend, ARIMA can be a good choice. For regular seasonality, the seasonal extension (SARIMA) is usually the better fit, since plain ARIMA has no seasonal terms.
- ARIMA’s forecasting accuracy tends to fall as the forecast horizon increases, so it is not well suited to long-term forecasts.
- If the data exhibits complex non-linear patterns, ARIMA may struggle to capture them, and a different model is likely a better choice.
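The point above about accuracy falling with the forecast horizon can be illustrated with the simplest differenced model, ARIMA(0, 1, 0), also known as a random walk. For that model the variance of the forecast error h steps ahead is h times the one-step variance, so the interval grows with the square root of h. The sketch assumes a one-step error standard deviation of 2.0, which is a made-up figure.

```python
import math

def interval_half_width(sigma, horizon, z=1.96):
    """Half-width of the ~95% forecast interval for a random walk, h steps ahead."""
    return z * sigma * math.sqrt(horizon)

sigma = 2.0  # assumed one-step forecast error standard deviation (illustrative)
for h in (1, 7, 30, 90):
    print(f"{h:>2} steps ahead: forecast +/- {interval_half_width(sigma, h):.1f}")
```

Under these assumptions the interval at 30 steps is about 5.5 times wider than at 1 step, which is why long-range ARIMA forecasts should be read with caution.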
Thank You
Thank you for taking the time to read this article.
I value your feedback! If you have any comments or questions, please feel free to share them in the comments or email me directly.