An epidemic model of foreclosures during the subprime mortgage crisis of 2008
NATHANIEL BROWN
Master's Exam Committee:
Loren Cobb, Chair
Audrey Hendricks
Burt Simon
Department of Mathematical and Statistical Sciences
University of Colorado Denver
Date: Presented November 6, 2015. Revised December 15, 2015.
1. Introduction and Background
The subprime mortgage crisis in the United States was the result of declining property
values and mortgage foreclosures, and led to massive investor losses between 2007 and 2010.
A housing price bubble formed in the years leading up to the crisis as easy access to credit
allowed buyers to inflate prices. As investors backed out of the market, credit became
harder to come by and home prices dropped. This drop in prices prevented borrowers
from refinancing before their adjustable rate mortgages reset, which made loan payments
unaffordable and forced the borrower into default and foreclosure.
For decades before the subprime mortgage crisis, it was common for a lender to package
hundreds or even thousands of loans into an investment vehicle called a Mortgage Backed
Security (MBS). The lender could then sell these securities to investors, which would open
up more capital for lending. At the height of the bubble, investors were buying up MBS as
quickly as lenders could produce them. This led to looser underwriting standards as lenders
tried to keep up with the demand for more MBS.
In addition to easy credit, in the early 2000s lenders started to calculate the risk of MBS
using a Gaussian copula function, which assumed that default risks were independent across
properties. This method for determining the risks of MBS quickly took off and was being used
by investors, banks, ratings agencies, and regulators who were under the assumption that the
risks were under control.[8] Unfortunately, the market did not behave as expected leading up
to the mortgage crisis, and it would appear as though the independence assumption broke
down, resulting in faulty risk models.
The relaxed underwriting gave borrowers who were previously ineligible for a mortgage
the opportunity to own a home. These borrowers were typically considered subprime,
meaning debtors who might have more difficulty keeping up with payments than more
qualified borrowers. They tended to have lower credit scores, inconsistent income, or excessive
preexisting debt. Subprime lending rose from a historical market share below
8% in 2004 to about 20% in 2006.[3] With more buyers in the market and loans easier to
receive, home prices began to climb.
Since subprime lending takes on more risk, the loans come with higher interest rates.
The higher rates made Adjustable Rate Mortgages (ARM) more attractive with their lower
initial rates. The majority of subprime mortgages originated during this period were ARM
loans, with about 90% of subprime loans being ARMs in 2006.[4] The borrowers often
expected to be able to refinance into a lower interest rate before their ARM loan reset.
However, investors began to detect that the quality of the securities was lower than
initially suspected. Purchases of MBS declined and lenders became stricter in their
requirements for mortgage qualification. The increased difficulty in qualifying for a mortgage
meant that there was less competition among buyers, which resulted in a severe decline
in property values. This decline put many borrowers underwater,
meaning that the home was worth less than the remaining balance on the loan.
Since lenders generally do not want the risk associated with an unsecured loan, underwater
mortgages are difficult or impossible to refinance.
Without the ability to refinance, ARM loans began to reset to higher interest rates.
Higher rates led to higher payments, which became increasingly difficult for some borrowers
to maintain. If a borrower is unable to make a payment and it becomes more than 30 days
past due, the loan is said to be in default. When in default, the holder of the loan may
initiate foreclosure proceedings.
After the housing market bubble, foreclosures spread across the country. As of September
2012, approximately 1.4 million homes, or 3.3 percent of all homes with a mortgage, were
in the national foreclosure inventory.[5] These represent homes that were in the foreclosure
process, but had not yet completed the process or cured the defaulted debt.
2. Motivation Behind the Model
A typical epidemic model is used to describe the process in which a disease spreads
through a population. It is assumed that an individual with a communicable disease will
spread it to susceptible individuals at a rate proportional to the product of the susceptible and
infected populations. After the disease has run its course, the infected individual is generally
considered recovered, either through acquired immunity or death, and is unable to transmit the
infection to the susceptible. In this way, we have a model that focuses on a population and
moves individuals through three stages: the susceptible, the infected, and the recovered.
Foreclosures act in a similar fashion. A foreclosure in a neighborhood lowers the property
values in the area, which can put nearby properties underwater. Borrowers looking to
refinance will find it impossible, as they do not have enough equity in the property to
justify the new loan. Many borrowers stuck in adjustable loans find the higher payments
unaffordable and might default and let the property go to foreclosure. In this way, foreclosure
can be thought of as an infection, so we will explore the possibility that it might be modeled
using the same methods.
The data we will be exploring is the historical foreclosure rate of the City and County
of Denver. Along with about half the states in the nation, Colorado has a foreclosure process
that does not involve taking a case to court. What follows is a smooth foreclosure process
that has a relatively consistent timeline. This matches the assumption of an epidemic model
that an individual will be infected for a set amount of time before being moved into the
recovered category.
If we can use an epidemic model to describe foreclosures in a population, there are
several benefits to be considered. If there is a sudden increase in foreclosures, we might be
able to identify whether the surge will die out or if it will turn into another crisis. If there
is a crisis, we should be able to estimate how long the crisis will last before the foreclosure
rate falls back to historic levels.
3. The Data
We will examine foreclosures in the County of Denver, in Colorado, from 1997 through
2014. The Denver County Office of the Clerk and Recorder releases foreclosure information,
annually since 1978 and monthly since 2008.[1] Since we are interested in foreclosures leading
up to the subprime crisis as well as following it, we will look at the aggregated annual
loans in foreclosure, reported as foreclosure starts. In Figure 1 below, we see the number
of foreclosures by year. The historical data suggest a
base level of about 830 foreclosures per year, which we will treat as the rate of
foreclosure in a healthy economy with no dependence between mortgage defaults.
Figure 1. Foreclosures in Denver County by Year
For the more advanced epidemic model that includes births and deaths, we will use
originations and liquidations of mortgages within Denver County. The number of originations
in a given year was taken from the county's public records as the number of Deeds of Trust
that were recorded in that year.[2] Similarly, the liquidations will be represented by the
number of Releases of Deed of Trust that were recorded for a given year. In this way,
we will be able to include how many mortgages are entering or leaving the population. A
chart of originations and liquidations may be seen in Figure 2 below. The origination data
does not distinguish between subprime mortgages and higher-quality loans. Similarly, the
release records do not provide the reason for the mortgage closing, which could be through
foreclosure or short sale, or more natural means such as the owner selling the home or paying
it off.
Figure 2. Originations and Liquidations in Denver County by Year
4. Method
The first epidemic model we consider is a system of differential equations that describes
how individuals in a population move from one compartment into another. The
compartments we will examine are susceptible (S), infected (I), and recovered (R). Susceptible
individuals are mortgages that are not in foreclosure, the infected are those in foreclosure,
and the recovered have completed foreclosure. The system of differential equations,
shown below, has two parameters, $\beta$ and $\gamma$, that will need to be estimated. These
parameters are the reciprocals of the Time Between Contacts ($\beta^{-1}$) and the
Time Until Recovery ($\gamma^{-1}$).
Model I
\begin{align*}
\frac{dS}{dt} &= -\beta I S \\
\frac{dI}{dt} &= \beta I S - \gamma I \\
\frac{dR}{dt} &= \gamma I
\end{align*}
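To make the dynamics of Model I concrete, the following is a minimal sketch that integrates the three equations with a classical fourth-order Runge-Kutta step. It assumes that S, I, and R are expressed as fractions of the mortgage population and that time is measured in years; the starting fractions and the function names (`sir_rhs`, `rk4_step`, `simulate_sir`) are hypothetical choices made for illustration, not values or code from the original analysis.

```python
def sir_rhs(state, beta, gamma):
    """Right-hand side of Model I for state = (S, I, R)."""
    s, i, r = state
    new_infections = beta * i * s
    recoveries = gamma * i
    return (-new_infections, new_infections - recoveries, recoveries)


def rk4_step(state, beta, gamma, dt):
    """Advance (S, I, R) by one classical Runge-Kutta step of size dt."""
    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))

    k1 = sir_rhs(state, beta, gamma)
    k2 = sir_rhs(shifted(state, k1, dt / 2), beta, gamma)
    k3 = sir_rhs(shifted(state, k2, dt / 2), beta, gamma)
    k4 = sir_rhs(shifted(state, k3, dt), beta, gamma)
    return tuple(
        x + dt / 6 * (a + 2 * b + 2 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate_sir(s0, i0, r0, beta, gamma, years, steps_per_year=100):
    """Return a list of (S, I, R) tuples, one per year, starting at year 0."""
    dt = 1.0 / steps_per_year
    state = (s0, i0, r0)
    yearly = [state]
    for _ in range(years):
        for _ in range(steps_per_year):
            state = rk4_step(state, beta, gamma, dt)
        yearly.append(state)
    return yearly


# Hypothetical starting fractions, chosen only for illustration.
trajectory = simulate_sir(s0=0.99, i0=0.01, r0=0.0, beta=3.2, gamma=2.2, years=10)
```

Because the three right-hand sides sum to zero, S + I + R stays constant along the computed trajectory, which is a convenient sanity check on any implementation of Model I.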
Another model, similar to the first, also takes into account the birth rate ($\Lambda$) and death
rate ($\mu$) of a population, represented here by originations and liquidations. Typically, this
model should be used when the arrival of new susceptible individuals is significant on the
time scale under consideration. Both models assume that individuals move only in the
direction S → I → R, so the infected and recovered cannot return to the susceptible compartment. We saw
in Figure 2 that the birth and death rates are known, but vary over time.
Model II
\begin{align*}
\frac{dS}{dt} &= \Lambda - \mu S - \beta I S \\
\frac{dI}{dt} &= \beta I S - (\gamma + \mu) I \\
\frac{dR}{dt} &= \gamma I - \mu R
\end{align*}
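As a quick consistency check on Model II, the three equations can be summed. The symbol $N = S + I + R$ for the total number of mortgages is introduced here only for illustration and does not appear in the original analysis. The infection and recovery terms cancel, leaving
\begin{align*}
\frac{dN}{dt} &= \frac{dS}{dt} + \frac{dI}{dt} + \frac{dR}{dt} \\
&= (\Lambda - \mu S - \beta I S) + (\beta I S - (\gamma + \mu) I) + (\gamma I - \mu R) \\
&= \Lambda - \mu (S + I + R) = \Lambda - \mu N .
\end{align*}
Under this reading, the mortgage population stays constant only when originations exactly balance liquidations ($\Lambda = \mu N$), and setting $\Lambda = 0$ and $\mu = 0$ recovers Model I.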
The parameters will be estimated by the least squares method. Since there is not
a closed-form least squares solution, we will determine the estimates numerically, which is
equivalent. We will also examine the minimized Median Absolute Error (MAE), which can
yield different results. We will then pick a best model based on the lowest Mean Square Error
(MSE) relative to $I_t$. We will assume a population of 140,000 mortgages in Denver County
based on the number of homeowners in Denver and the proportion that have mortgages.[7]
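The numerical estimation described above can be sketched as a search over candidate parameter values. The example below is only an illustration under stated assumptions: it uses a simple grid search, small Euler steps, population fractions rather than counts, and synthetic observations generated from known parameters. The original analysis does not say which optimizer or integrator was used, and none of the function names below come from it.

```python
from statistics import median


def simulate_infected(beta, gamma, s0, i0, years, steps_per_year=50):
    """Yearly infected fraction from Model I, integrated with small Euler steps."""
    dt = 1.0 / steps_per_year
    s, i = s0, i0
    infected = [i]
    for _ in range(years):
        for _ in range(steps_per_year):
            new_inf = beta * i * s * dt
            rec = gamma * i * dt
            s, i = s - new_inf, i + new_inf - rec
        infected.append(i)
    return infected


def mean_square_error(observed, predicted):
    """Average of the squared differences between observed and predicted."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def median_absolute_error(observed, predicted):
    """Median of the absolute differences between observed and predicted."""
    return median(abs(o - p) for o, p in zip(observed, predicted))


def grid_search(observed, s0, i0, betas, gammas, loss):
    """Return (best_loss, beta, gamma) minimizing loss over the grid."""
    years = len(observed) - 1
    best = None
    for beta in betas:
        for gamma in gammas:
            predicted = simulate_infected(beta, gamma, s0, i0, years)
            score = loss(observed, predicted)
            if best is None or score < best[0]:
                best = (score, beta, gamma)
    return best


# Synthetic "observations" generated from known parameters (not the Denver data).
true_beta, true_gamma = 3.2, 2.2
synthetic = simulate_infected(true_beta, true_gamma, s0=0.99, i0=0.01, years=12)

grid = [round(1.0 + 0.1 * k, 1) for k in range(41)]  # 1.0, 1.1, ..., 5.0
best_mse = grid_search(synthetic, 0.99, 0.01, grid, grid, mean_square_error)
best_mae = grid_search(synthetic, 0.99, 0.01, grid, grid, median_absolute_error)
```

On noise-free synthetic data both loss functions recover the generating parameters. With real data the two criteria can disagree, because the median absolute error is far less sensitive to a few years with large misses.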
5. Results
The initial model (Model I) ignores originations and liquidations of mortgages. The
parameter estimation techniques resulted in $\hat{\beta} = 3.2$ and $\hat{\gamma} = 2.2$ from both the lowest Mean
Square Error and lowest Median Absolute Error. This implies an infection period (time in
foreclosure) of $1/\hat{\gamma} = 0.4545$ years, or about 166 days. The Mean Square Error was 772,249 and
the Median Absolute Error was 390.9. The model also roughly follows the actual number of
foreclosures year over year, as seen in Figure 3. This model has the advantage of simplicity,
though it does ignore the possibly significant effects of originations and liquidations in the
mortgage population.
Figure 3. Results of the SIR model without originations and releases.
The following model (Model II) includes the originations and liquidations, representing
births and deaths in the typical epidemic model. The parameter estimation techniques
resulted in $\hat{\beta} = 4.4$ and $\hat{\gamma} = 3.5$ from both the lowest Mean Square Error and lowest Median
Absolute Error. This implies an infection period of $1/\hat{\gamma} = 0.2865$ years, or about 105 days. The Mean
Square Error was 995,743 and the Median Absolute Error was 662.0. Using the MSE as a
comparison, this model is less accurate than the first. One possible reason
is that, in recent years, there has been an increase in high-quality mortgages
that are much less likely to go into foreclosure than the subprime ones that
were being originated in the years leading up to the crisis. This is evident in the increase in
the model's expected foreclosures in later years compared to the actual decrease, as seen in
Figure 4.
Figure 4. Results of the SIR model with originations and releases.
We can also try least squares regression as a means to estimate the parameters of
Model I. The benefit of this is that it provides a measure of the uncertainty of
the estimates. To do this, we transform the differential equations to fit discrete
time and perform ordinary linear regression with the intercept set at zero on
$$I_{t+1} = \beta (S_t I_t) + (1 - \gamma) I_t.$$
Since this requires knowing the Susceptible values at each
time $t$ and we do not have any way of calculating these
directly, $S_t$ will be estimated for all $t$ using the solution
from $\hat{\beta} = 3.2$ and $\hat{\gamma} = 2.2$, the lowest Mean Square
Error estimates above.
Figure 5. QQ Plot of Residuals
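The zero-intercept regression on the discrete-time equation can be sketched by solving the two-by-two normal equations directly. This is only an illustration: the function name `fit_discrete_sir`, the synthetic series, and the parameter values 0.5 and 0.3 are hypothetical, chosen so that the discrete recursion stays well behaved, and the series are treated as population fractions. It does not reproduce the confidence intervals, which would need the residual variance as well.

```python
def fit_discrete_sir(susceptible, infected):
    """Least squares fit of I[t+1] = beta*S[t]*I[t] + (1 - gamma)*I[t], no intercept."""
    x1 = [s * i for s, i in zip(susceptible[:-1], infected[:-1])]  # S_t * I_t
    x2 = list(infected[:-1])                                       # I_t
    y = list(infected[1:])                                         # I_{t+1}

    # Entries of the normal equations for a regression through the origin.
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    s1y = sum(a * t for a, t in zip(x1, y))
    s2y = sum(b * t for b, t in zip(x2, y))

    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("predictors are collinear; the fit is not identifiable")
    coef_si = (s1y * s22 - s2y * s12) / det  # estimate of beta
    coef_i = (s11 * s2y - s12 * s1y) / det   # estimate of 1 - gamma
    return coef_si, 1.0 - coef_i


# Synthetic series from the discrete recursion with hypothetical parameters.
demo_beta, demo_gamma = 0.5, 0.3
s_series, i_series = [0.99], [0.01]
for _ in range(40):
    s, i = s_series[-1], i_series[-1]
    s_series.append(s - demo_beta * s * i)
    i_series.append(demo_beta * s * i + (1 - demo_gamma) * i)

beta_hat, gamma_hat = fit_discrete_sir(s_series, i_series)
```

Note that the second regression coefficient estimates $1 - \gamma$, so $\gamma$ is recovered by subtracting it from one.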
This method does require some assumptions to be
met. Normality of the residuals was verified using a
QQ plot, which shows that the residuals are normal
to a reasonable degree (Figure 5). Since we are working
with time-series data, we also need to be concerned with
serial dependence when performing regression. A Durbin-Watson
test gave a p-value of 0.064, so we narrowly fail to reject the hypothesis of
no serial dependence at the 0.05 level.
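For reference, the Durbin-Watson statistic itself is straightforward to compute from the regression residuals. The sketch below shows only the statistic, using made-up residual vectors; it does not compute the p-value, which depends on the sampling distribution of the statistic and is normally obtained from statistical software.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator


# Made-up residuals for illustration only.
alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])  # strong negative serial correlation
constant = durbin_watson([1.0, 1.0, 1.0, 1.0])       # strong positive serial correlation
```

The statistic lies between 0 and 4. Values near 2 suggest little first-order serial correlation, values near 0 suggest positive correlation, and values near 4 suggest negative correlation.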
The results of the regression model (Figure 6) were parameter estimates $\hat{\beta} = 2.9$ and
$\hat{\gamma} = 2.1$. This method provides a measure of the uncertainty of the estimates, so we have
95% confidence intervals of
$\beta$: (2.393686, 3.442656)
$\gamma$: (1.713751, 2.446380),
so both are significantly different from zero at the 0.05 level. The Mean Square Error was
892,825 and the Median Absolute Error was 610.5, which places this method a little ahead of
the one used on Model II. It is no surprise that this method does not match the first method
on Model I because that method minimized the squared errors from $I_t$, whereas this method
minimized the squared errors from $I_{t+1} - I_t$.
Figure 6. Results of regression on the SIR model.
With a Mean Square Error of 772,249, the better model appears to be the SIR model
without originations or liquidations. The Median Absolute Error of 390.9 indicates that, in half of the
years, the model predicted the number of foreclosures to within about 391 foreclosures.
Additionally, the $\hat{\gamma}$ value of 2.2 implies that the average time spent in foreclosure is 0.45 years,
or about 165.9 days. This closely matches the average foreclosure timeline for Colorado of
177 days.[6] The basic reproduction number, $R_0 = \beta/\gamma = 3.2/2.2 = 1.45$, is the expected number
of new foreclosures that result from one foreclosure when all mortgages are susceptible.
According to this model, we can expect the foreclosure rate to reach the base rate of 831 by
the year 2020, or within 20 foreclosures of that rate by the year 2017.
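The arithmetic behind these summary quantities is short enough to check directly. The sketch below recomputes them from the Model I estimates, assuming a 365-day year. It also reports the susceptible fraction $\gamma/\beta$ below which the infected compartment shrinks in Model I, which follows from $dI/dt = I(\beta S - \gamma)$ when S is read as a fraction of the population; that threshold is an added illustration, not a figure from the original analysis.

```python
beta_hat, gamma_hat = 3.2, 2.2  # Model I estimates reported above

r0 = beta_hat / gamma_hat         # basic reproduction number
mean_years = 1.0 / gamma_hat      # average time in foreclosure, in years
mean_days = 365 * mean_years      # same duration, assuming a 365-day year
threshold = gamma_hat / beta_hat  # dI/dt < 0 once the susceptible fraction is below this

print(f"R0 = {r0:.2f}")
print(f"mean duration = {mean_years:.4f} years = {mean_days:.1f} days")
print(f"infected compartment declines once S < {threshold:.4f}")
```

The first two printed lines reproduce the 1.45, 0.4545 years, and 165.9 days quoted in the text.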
6. Ideal Situation
Since detailed data on foreclosures are not generally made public, it is difficult to build
a clear picture of the foreclosure process. In many cases, loan-level data are confidential
and loan servicers are hesitant to release even summary information that might shed a bad
light on their practices. Some data are released by government organizations and regulatory
agencies, such as the Denver County data we used, but these are also often sparse.
Though such data are not publicly available for the studied population, it is possible that
focusing on subprime mortgages would be more fruitful, since these are more likely to default
and go to foreclosure. As mentioned before, the increase in high-quality mortgages in recent
years made the full set of originations and liquidations a poor stand-in for births and
deaths. Access to subprime-specific data would allow more modeling options to be explored.
7. Conclusion
This demonstrates that applying an epidemic model to an economic situation might
make sense under certain circumstances. Since foreclosures in a population can spread, we
were able to form a model that is typically used for the spread of disease. The implication
is that we might be able to model a sudden increase in foreclosures in the future. A basic
reproduction number (R0) less than one implies that the epidemic will die out. If R0 is
greater than one, we would be likely to see another mortgage crisis and might be able to
prepare for it. If another mortgage crisis triggers an increase in foreclosures, we would then
be able to estimate how long the crisis would last.
The next step in the process might be to explore this modeling procedure using other
populations, especially in states such as Florida that have a longer foreclosure timeline that is
often held up by courts. It would also be justified to model foreclosures under the assumption
that a mortgage might be brought out of foreclosure and be placed back into the susceptible
population before the foreclosure completes. The subprime mortgage crisis was foreseen by
few, but we can use the experience as a lesson in modeling the future.
References
[1] Foreclosure Statistical Information. Denver Office of the Clerk and Recorder, 31 Dec. 2014. Web. 28 Feb.
2015. https://www.denvergov.org/clerkandrecorder/DenverOfficeoftheClerkandRecorder/Foreclosures/ForeclosureStats/tabid/445088/Default.aspx.
[2] Public Records. Denver Office of the Clerk and Recorder. 29 Apr. 2015. Web. 2 May
2015. http://www.denvergov.org/clerkandrecorder/DenverOfficeoftheClerkandRecorder/tabid/445006/Default.aspx.
[3] Simkovic, Michael, Competition and Crisis in Mortgage Securitization (October 8, 2011). Indiana Law
Journal, Vol. 88, p.213, (2013).
[4] Zandi, Mark (2010). Financial Shock. FT Press.
[5] CoreLogic Reports 57,000 Completed Foreclosures in September. CoreLogic, 31 Oct. 2012. Web. 26
Apr. 2015. http://multivu.prnewswire.com/mnr/corelogic/56990.
[6] Average Length of Foreclosure by State, by Number of Days. Baltimoresun.com. 10 Apr. 2014. Web.
2 May 2015. http://www.baltimoresun.com/news/data/bal-average-length-of-foreclosure-by-state-by-
number-of-days-20140924-htmlstory.html.
[7] United States Census Bureau. Denver County QuickFacts from the US Census Bureau. 31 Mar. 2015.
Web. 3 May 2015. http://quickfacts.census.gov/qfd/states/08/08031.html.
[8] Salmon, Felix. Recipe for Disaster: The Formula That Killed Wall Street. Wired.com. 23 Feb. 2009.
Web. 19 Oct. 2015. http://archive.wired.com/techbiz/it/magazine/17-03/wp_quant?currentPage=all.