This entire unit is about learning to fit models (equations) to different sets of raw data. Do not be confused by the word model. In mathematics, we often use the terms functions, equations, and models interchangeably, even though each has its own formal definition. The term model is typically used to indicate that the equation or function approximates a real-world situation.
Up until this point, we've fit either a linear or a quadratic model to the data we've been looking at. However, sometimes neither of those models is the best fit for the data in question. Occasionally, when looking at data that has a curve to it, an exponential model may be the best fit. This course has already looked at exponential models in Unit 1, with the compound interest formula. With an exponential function, the way that the data increases or decreases tells us a lot about the function. Knowing the behavior of an exponential function in general will allow us to recognize when to use an exponential model over a linear or quadratic model.
An exponential function is a function with base $b$ in which the variable $x$ appears in the exponent.
$$f(x)=c \cdot b^x$$
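Here is a breakdown of what each of those letters does to the exponential function:
- $b$ is the base, with $b>0$ and $b \neq 1$. If $b>1$, the function models exponential growth; if $0<b<1$, it models exponential decay.
- $c$ is the initial value, since $f(0)=c \cdot b^0=c$. It is the $y$-intercept of the graph.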
Before we get into applying this information, let's first look at how this will appear graphically.
Exponential Curve $y=0.5^x$
Figure 2.5.1: Graph of $y=0.5^x$
Exponential Curve $y=1.5^x$
Figure 2.5.2: Graph of $y=1.5^x$
Note: In the exponential decay graph, $b=0.5$, and it can be seen that the graph is decreasing rapidly at first then it levels off. In the exponential growth graph, $b=1.5$, and it can be seen that the graph is increasing slowly at first and then rapidly increases.
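The behavior described in the note can also be checked numerically. The following is a minimal sketch that tabulates the two curves from the figures; the helper name `exponential` and the choice of sample points are assumptions made for illustration.

```python
def exponential(x, c=1.0, b=2.0):
    """Evaluate f(x) = c * b**x."""
    return c * b ** x

xs = [0, 1, 2, 3, 4]

# 0 < b < 1: each step multiplies the value by 0.5 (decay)
decay = [exponential(x, b=0.5) for x in xs]
# b > 1: each step multiplies the value by 1.5 (growth)
growth = [exponential(x, b=1.5) for x in xs]

for x, d, g in zip(xs, decay, growth):
    print(f"x={x}:  0.5^x = {d:.4f}   1.5^x = {g:.4f}")
```

The decay column drops quickly and then changes very little, while the growth column starts with small increases that get larger at each step.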
Now that we see how the $b$ value can impact the graph, let's look at how the $c$ value can impact the graph.
Exponential Curve $y=2(0.5)^x$
Figure 2.5.3: Graph of $y=2(0.5)^x$
Exponential Curve $y=0.5(1.5)^x$
Figure 2.5.4: Graph of $y=0.5(1.5)^x$
Note: In the decreasing function, $c=2$, which appears to make the graph 'decay' faster, while in the increasing function, $c=0.5$, which appears to make the graph 'grow' slower.
This allows us to make some generalizations about what $c$ does to our graphs. Because $f(0)=c$, the value of $c$ is the $y$-intercept of the graph. Multiplying by $c$ stretches the graph vertically when $c>1$ and compresses it when $0<c<1$, while the base $b$ still determines whether the function grows or decays.
Just as a note, all exponential functions of this form have a horizontal asymptote, which is an imaginary line that the graph of a function gets arbitrarily close to but never actually touches. In the cases we are looking at, the graph gets arbitrarily close to the line $y=0$ (the $x$-axis) but never actually touches it.
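A short numerical sketch can illustrate both the role of $c$ and the asymptote. The function name and the sample values below are assumptions chosen for illustration, not part of the original lesson.

```python
def exponential(x, c, b):
    """Evaluate f(x) = c * b**x."""
    return c * b ** x

# At x = 0 the function value equals c, so c is the y-intercept.
intercepts = {c: exponential(0, c, 0.5) for c in (0.5, 1, 2)}

# Far to the right, a decay curve gets very close to 0 but stays positive.
tail = [exponential(x, 2, 0.5) for x in (10, 20, 40)]

print(intercepts)
print(tail)
```

Every value in `tail` is smaller than the one before it, yet none of them is zero or negative, which is what approaching a horizontal asymptote looks like in a table of values.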
It is also important to mention the number "$e$". The number "$e$" is an irrational number (it cannot be expressed as a fraction of two integers), similar to $\pi$. According to Mathnasium, Euler's number "$e$" has been calculated to over 1 trillion digits, but because its decimal expansion never ends, the digits can never all be listed. We approximate $e \approx 2.718281828$ and use the letter "$e$" to represent the number, which saves us from writing all of those decimal places.
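As a hedged illustration of why $e$ shows up in exponential models, the sketch below shows that a model written as $c \cdot e^{kx}$ is the same function as $c \cdot b^x$ with $b=e^k$. The values of `c`, `k`, and `x` are hypothetical and chosen only for the demonstration.

```python
import math

print(math.e)  # Python's built-in approximation of e

c = 2.0   # hypothetical initial value
k = 0.5   # hypothetical rate in the exponent
x = 3.0   # hypothetical input

b = math.exp(k)                  # equivalent base, b = e**k
with_e = c * math.exp(k * x)     # c * e^(k x)
with_base = c * b ** x           # c * b^x

print(with_e, with_base)
```

Both expressions give the same value up to rounding, so a spreadsheet trend line reported with $e$ is still an exponential function of the form $f(x)=c \cdot b^x$.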
To make an exponential model, we'd employ a number of the practices that we've learned earlier in this unit. Recall that when you're trying to find a model that best fits the data, you find the correlation coefficients using some form of technology. Once you've identified the correlation coefficient for the linear, quadratic, and exponential models, you then determine which model is the strongest based off of that number. When talking about regression analysis, if two correlation coefficients are approximately equal, you take the simpler of the two models.
In $2007$, a university study was published investigating the crash risk of alcohol-impaired driving. Data from $2,781$ crashes were used to measure the association of a person's blood alcohol concentration (BAC) with the risk of being in an accident. The results are shown in the table below. The relative risk is a measure of how many times more likely a person is to crash. For example, a person with a BAC of $0.09$ is $3.54$ times as likely to crash as a person who has not been drinking alcohol.
| BAC | $0$ | $0.01$ | $0.03$ | $0.05$ | $0.07$ | $0.09$ | $0.11$ | $0.13$ | $0.15$ | $0.17$ | $0.19$ | $0.21$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Relative Risk | $1$ | $1.03$ | $1.06$ | $1.38$ | $2.09$ | $3.54$ | $6.41$ | $12.6$ | $22.1$ | $39.05$ | $65.32$ | $99.78$ |
Image 2.5.1: Regression Run on Data for BAC.
Linear: $r_l=\sqrt{R^2}=\sqrt{0.6912}=0.8314$
Quadratic: $r_q=\sqrt{R^2}=\sqrt{0.9634}=0.9815$
Exponential: $r_e=\sqrt{R^2}=\sqrt{0.9978}=0.9989$
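The comparison of the three models can be sketched in a few lines. The $R^2$ values are the ones reported above; the dictionary layout and variable names are assumptions made for illustration.

```python
import math

# R^2 values as reported for each trend line
r_squared = {"linear": 0.6912, "quadratic": 0.9634, "exponential": 0.9978}

# The correlation coefficient is the square root of R^2
r = {name: math.sqrt(value) for name, value in r_squared.items()}

# Pick the model with the largest correlation coefficient
best = max(r, key=r.get)

for name, value in r.items():
    print(f"{name}: r = {value:.4f}")
print("strongest model:", best)
```

This sketch only picks the largest value. When two coefficients are approximately equal, the simpler model would still be preferred, and that judgment is not automated here.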
Looking at those numbers, it can be determined that since $r_e$ is the largest correlation coefficient, the exponential model of $\hat{y}=0.583e^{23.818x}$ is the best option for this set of data. We can use this model to estimate the relative risk for a BAC of $0.16$ by substituting $x=0.16$:
$$\begin{align*} \hat{y}&=0.583e^{23.818 \cdot 0.16}\\ \hat{y}&=0.583e^{3.81088}\\ \hat{y}&=0.583(45.19019)\\ \hat{y}&=26.35 \end{align*}$$
This means that a $160$ pound individual who has had 6 drinks and has a BAC of $0.16$ is $26.35$ times as likely to have an accident as a person who has had no drinks and a BAC of $0$.
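One way to check an exponential trend line is to refit it from the table. The sketch below assumes the trend line is an ordinary least-squares line through the points $(x, \ln y)$, which is a common way of fitting $\hat{y}=c \cdot e^{kx}$; the helper names `fit_line` and `predict` are hypothetical. The fitted `c` and `k` can be compared with the coefficients of the model stated above.

```python
import math

bac = [0, 0.01, 0.03, 0.05, 0.07, 0.09, 0.11, 0.13, 0.15, 0.17, 0.19, 0.21]
risk = [1, 1.03, 1.06, 1.38, 2.09, 3.54, 6.41, 12.6, 22.1, 39.05, 65.32, 99.78]

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Taking ln of y = c * e^(k x) gives ln(y) = ln(c) + k * x, a straight line.
k, ln_c = fit_line(bac, [math.log(y) for y in risk])
c = math.exp(ln_c)

def predict(x):
    """Relative risk predicted by the fitted model c * e^(k x)."""
    return c * math.exp(k * x)

print(f"c = {c:.3f}, k = {k:.3f}")
print(f"predicted relative risk at BAC 0.16: {predict(0.16):.2f}")
```

Other fitting methods, such as nonlinear least squares on the untransformed data, would give somewhat different coefficients, so this is one reasonable check rather than the only possible fit.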
As you have probably seen, not all data sets are well suited to a linear, quadratic, or even an exponential model. The list of function types that we can fit to data is larger than just those three. In an exponential function, growth starts slowly and then as time progresses it starts to get more rapid. But what if the change was more rapid initially, and over time the growth slowed down but never leveled off at an asymptote? This type of data would most likely fit a logarithmic function. A logarithmic function is another function family that can be used to model data.
A logarithmic function can take on a number of characteristics. One of those characteristics is the base of the function. The Arabic number system (our number system) is a base ten system. Its digits are $0,1,2,3,4,5,6,7,8,9$, and every number beyond those is written as a combination of those ten digits. Most computer systems work in base two, or binary. That number system has the digits $0$ and $1$, and every number beyond those is written as a combination of those two digits. At this point, you've probably realized that there are multiple types of number systems, and logarithms can likewise be written in different bases.
A logarithmic function with base $b$ has the form
$$f(x)=a \cdot \log_b(x)$$
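To see how the base and the slowing growth show up in actual numbers, here is a minimal sketch of $f(x)=a \cdot \log_b(x)$. The helper name `log_model` and its default values are assumptions for illustration.

```python
import math

def log_model(x, a=1.0, base=10.0):
    """Evaluate f(x) = a * log_base(x) for x > 0 using the change-of-base rule."""
    return a * math.log(x) / math.log(base)

# Each tenfold increase in x adds only 1 to the base-ten logarithm.
base_ten = {x: log_model(x) for x in (1, 10, 100, 1000)}

# The same input gives different outputs in different bases.
base_two = log_model(8, base=2.0)
natural = log_model(math.e, base=math.e)

print(base_ten)
print(base_two, natural)
```

The input has to grow from 10 to 100 to 1000 just to raise the output from 1 to 2 to 3, which is the "fast at first, then slower" growth that makes logarithmic models useful.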
Graphically, the function would look similar to this:
Logarithmic Curve $f(x)=\log_{10}(x)$
Figure 2.5.5: Graph of $f(x)=\log_{10}(x)$
Logarithmic Curve $f(x)=\ln(-x)$
Figure 2.5.6: Graph of $f(x)=\ln(-x)$
It can be seen that the graph grows quickly, then over time, the growth slows down. The reverse of that can be seen in the right graph. Also notice how in the right graph, the function is $f(x)=\ln(-x)$. The $\ln$ stands for the natural log and is the logarithmic function with base "$e$" ($\log_e(x)=\ln(x)$). The basic logarithmic function that is most typically used for modeling is $y=a+b\ln(x)$. For this function:
- All input values $x$ must be greater than zero.
- The point $(1,a)$ is on the graph, because $\ln(1)=0$.
- If $b>0$, the model is increasing; growth is rapid at first and then slows.
- If $b<0$, the model is decreasing; decay is rapid at first and then slows.

Due to advances in medicine and higher standards of living, life expectancy has been increasing in most developed countries since the beginning of the $20^{th}$ century. The table below shows the average life expectancies, in years, of Americans from $1900$ to $2010$.
| Year | $1900$ | $1910$ | $1920$ | $1930$ | $1940$ | $1950$ | $1960$ | $1970$ | $1980$ | $1990$ | $2000$ | $2010$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Life Expectancy | $47.3$ | $50.0$ | $54.1$ | $59.7$ | $62.9$ | $68.2$ | $69.7$ | $70.8$ | $73.7$ | $75.4$ | $76.8$ | $78.7$ |
The following graph was constructed using Microsoft Excel and fitting a logarithmic trend-line.
Image 2.5.2: Regression Run on Data for a logarithmic model.
Logarithmic: $\sqrt{R^2}=\sqrt{0.9563}=0.9779$
This suggests that there is a strong positive correlation. Additionally, the model that best fits the data is given by $\hat{y}=42.527+13.858\ln(x)$.
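To predict the life expectancy in $2030$, first convert the year into the model's input. Because $\ln(0)$ is undefined, the trend line is fit with $x=1$ for $1900$, $x=2$ for $1910$, and so on, so:
$$x=\frac{2030-1900}{10}+1=14$$
By substituting $x=14$ into our equation, we get:
$$\begin{align*} \hat{y}&=42.527+13.858\ln(14)\\ \hat{y}&=42.527+13.858(2.639057)\\ \hat{y}&=42.527+36.572052\\ \hat{y}&=79.10 \end{align*}$$
This means that in the year 2030, which corresponds to $x=14$, the average American can expect to live to be about 79 years old.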
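The logarithmic trend line can be reproduced with a short least-squares sketch. It assumes the decades are numbered $x=1$ for $1900$ through $x=12$ for $2010$, the numbering under which an ordinary least-squares fit of $y$ against $\ln(x)$ matches the reported coefficients and $R^2$. The names `year_to_x` and `predict` are hypothetical helpers.

```python
import math

years = list(range(1900, 2011, 10))
life = [47.3, 50.0, 54.1, 59.7, 62.9, 68.2, 69.7, 70.8, 73.7, 75.4, 76.8, 78.7]

def year_to_x(year):
    """Assumed numbering: x = 1 for 1900, x = 2 for 1910, and so on."""
    return (year - 1900) / 10 + 1

# y = a + b * ln(x) is a straight line in the variable u = ln(x).
u = [math.log(year_to_x(year)) for year in years]
n = len(u)
mean_u = sum(u) / n
mean_y = sum(life) / n
suu = sum((ui - mean_u) ** 2 for ui in u)
suy = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, life))

b = suy / suu
a = mean_y - b * mean_u

# R^2 compares the leftover error with the total variation in y.
ss_res = sum((yi - (a + b * ui)) ** 2 for ui, yi in zip(u, life))
ss_tot = sum((yi - mean_y) ** 2 for yi in life)
r2 = 1 - ss_res / ss_tot

def predict(year):
    """Life expectancy predicted by the fitted model a + b * ln(x)."""
    return a + b * math.log(year_to_x(year))

print(f"a = {a:.3f}, b = {b:.3f}, R^2 = {r2:.4f}")
print(f"predicted life expectancy in 2030: {predict(2030):.2f}")
```

Because the model is linear in $\ln(x)$, the same slope and intercept formulas used for a straight-line fit apply once the inputs are log-transformed.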