Chapter 2: Modeling Data

Section 2.5: Exponential and Logarithmic Models

This entire unit is about learning to fit models (equations) to different sets of raw data. Do not be confused by the word model. In mathematics, we often use the terms function, equation, and model interchangeably, even though each has its own formal definition. The term model is typically used to indicate that the equation or function approximates a real-world situation.

Up until this point, we've fit either a linear or a quadratic model to the data we've been looking at. However, sometimes neither of those models is the best fit for the data in question. Occasionally, when looking at data that has a curve to it, an exponential model may be the best fit. This course has already looked at exponential models in Unit 1, with the compound interest formula. With an exponential function, the way that the data increases or decreases tells us a lot about the function. Knowing the behavior of an exponential function in general will allow us to recognize when to use an exponential model over a linear or quadratic model.

Exponential Function

An exponential function is a function with a constant base $b$ and the variable $x$ in the exponent.

$$f(x)=c \cdot b^x$$

Here is a breakdown of what each of those letters does to the exponential function:

The letter $b$ is the base. It is a number that is greater than zero and not equal to one.
The initial value of the model is $c$. This is the value the function takes on when $x=0$, so the $y$-intercept is the point $(0,c)$. The descriptions below assume $c>0$.
If $b>1$, the model is an example of exponential growth. As $x$ increases, the outputs of the model, or $y$, increase. They increase slowly at first, but as $x$ gets larger the growth becomes more rapid and continues without bound. This means the outputs increase forever.
If $0 < b < 1$, the model is an example of exponential decay. As $x$ increases, the outputs of the model, or $y$, decrease. They decrease rapidly at first and then level off as the graph approaches its horizontal asymptote, the $x$-axis. This means that the graph gets extremely close to the $x$-axis, but never actually touches it or becomes less than zero (negative).
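
The growth and decay behavior described above can be checked numerically. The following is a minimal Python sketch, assuming $c=1$ and the two bases used in the figures below ($b=1.5$ and $b=0.5$); the function name `exponential` is illustrative and does not come from the text.

```python
def exponential(c, b, x):
    """Evaluate the exponential model f(x) = c * b**x."""
    return c * b ** x

xs = list(range(0, 6))
growth = [exponential(1, 1.5, x) for x in xs]  # b > 1: exponential growth
decay = [exponential(1, 0.5, x) for x in xs]   # 0 < b < 1: exponential decay

for x, g, d in zip(xs, growth, decay):
    print(f"x={x}  growth={g:8.4f}  decay={d:8.4f}")
```

In the printed table, the growth column rises by larger and larger amounts, while the decay column shrinks toward zero without ever reaching it.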

Before we get into applying this information, let's first look at how this will appear graphically.

Exponential Curve $y=0.5^x$

Figure 2.5.1: Graph of $y=0.5^x$

Exponential Curve $y=1.5^x$

Figure 2.5.2: Graph of $y=1.5^x$

Note: In the exponential decay graph, $b=0.5$, and the graph decreases rapidly at first and then levels off. In the exponential growth graph, $b=1.5$, and the graph increases slowly at first and then increases rapidly.

Now that we see how the $b$ value can impact the graph, let's look at how the $c$ value can impact the graph.

Exponential Curve $y=2(0.5)^x$

Figure 2.5.3: Graph of $y=2(0.5)^x$

Exponential Curve $y=0.5(1.5)^x$

Figure 2.5.4: Graph of $y=0.5(1.5)^x$

Note: In the decreasing function, $c=2$, which stretches the graph vertically and makes it appear to 'decay' faster. In the increasing function, $c=0.5$, which compresses the graph vertically and makes it appear to 'grow' more slowly.

This allows us to make some generalizations about what $c$ does to our graphs.

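If $0 < c < 1$, every output is scaled down, so the growth/decay of the function appears slower or less rapid.
If $c > 1$, every output is scaled up, so the growth/decay of the function appears faster or more rapid.
In both cases the base $b$ is unchanged, so the percent change per unit of $x$ stays the same.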
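
To see what $c$ does in concrete numbers, here is a small Python sketch, assuming the decay base $b=0.5$ and three sample values of $c$. The helper name `summarize` is hypothetical. For each $c$ it lists the outputs, the ratio between consecutive outputs, and the amount the output drops at each step.

```python
def summarize(c, b, n=4):
    """Return outputs of c * b**x for x = 0..n-1, plus ratios and step sizes."""
    ys = [c * b ** x for x in range(n)]
    ratios = [ys[i + 1] / ys[i] for i in range(n - 1)]  # always equal to b
    drops = [ys[i] - ys[i + 1] for i in range(n - 1)]   # scaled by c
    return ys, ratios, drops

for c in (0.5, 1, 2):
    ys, ratios, drops = summarize(c, 0.5)
    print(f"c={c}: outputs={ys} ratios={ratios} drops={drops}")
```

The ratios are the same for every $c$, while the size of each drop is multiplied by $c$. This is why a larger $c$ makes the graph look steeper even though the base has not changed.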

Just as a note, every exponential function of this form has a horizontal asymptote, which is an imaginary line that the graph of a function gets arbitrarily close to but never actually touches. In the cases we are looking at, the graph gets arbitrarily close to the line $y=0$ (the $x$-axis) but never actually touches it.

It is also important to mention the number $e$. Like $\pi$, the number $e$ is irrational: it cannot be expressed as a fraction of two integers, so its decimal expansion never ends and never repeats. According to Mathnasium, Euler's number $e$ has been computed to over 1 trillion digits, but there will always be more digits left to find. To nine decimal places, $e \approx 2.718281828$. We use the letter $e$ to represent that number to save us from writing all of those decimal places.
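
One way to get a feel for $e$ is to approximate it. The sketch below relies on a standard fact that is not stated in the text above: the expression $(1+1/n)^n$, which is the value of one dollar after a year at 100% interest compounded $n$ times, gets closer to $e$ as $n$ grows. The function name `compound_approx` is illustrative.

```python
import math

def compound_approx(n):
    """Approximate e by compounding 100% interest n times in one period."""
    return (1 + 1 / n) ** n

for n in (1, 10, 100, 10_000, 1_000_000):
    approx = compound_approx(n)
    print(f"n={n:>9}  approx={approx:.9f}  error={math.e - approx:.2e}")

print(f"math.e = {math.e:.9f}")
```

The error shrinks as $n$ increases, but no finite $n$ gives $e$ exactly.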

2.5.1 Making Exponential Models

To make an exponential model, we employ a number of the practices that we've learned earlier in this unit. Recall that when you're trying to find the model that best fits the data, you find each model's correlation coefficient using some form of technology. Once you've identified the correlation coefficient for the linear, quadratic, and exponential models, you determine which model is the strongest based on that number: the closer $|r|$ is to $1$, the stronger the fit. In regression analysis, if two correlation coefficients are approximately equal, you take the simpler of the two models.

Example 1

In $2007$, a university study was published investigating the crash risk of alcohol-impaired driving. Data from $2{,}781$ crashes were used to measure the association of a person's blood alcohol concentration (BAC) with the risk of being in an accident. The results are shown in the table below. The relative risk is a measure of how many times more likely a person is to crash. For example, a person with a BAC of $0.09$ is $3.54$ times as likely to crash as a person who has not been drinking alcohol.

BAC	$0$	$0.01$	$0.03$	$0.05$	$0.07$	$0.09$	$0.11$	$0.13$	$0.15$	$0.17$	$0.19$	$0.21$
Relative Risk	$1$	$1.03$	$1.06$	$1.38$	$2.09$	$3.54$	$6.41$	$12.6$	$22.1$	$39.05$	$65.32$	$99.78$

Let $x$ represent the BAC level, and let $y$ represent the corresponding relative risk. Fit a linear, quadratic, and exponential model to the data. Include the correlation coefficient $r$ for each model.
Based on the three models fitted to the data, identify which is the best fit and why.
After 6 drinks, a person weighing 160 pounds will have a BAC of about 0.16. How many times more likely is a person of this weight to crash after 6 drinks? Round your answer to the nearest hundredth.

Solution

The image below shows a graph that was constructed using Excel.

Image 2.5.1: Regression Run on Data for BAC.
In this graph, the green line represents the linear model, the red represents the quadratic model, and the blue represents the exponential model. The graph displays each model's equation and its coefficient of determination $R^2$. Using the coefficients of determination, we need to identify each correlation coefficient $r$ by taking the square root of $R^2$.
Linear: $r_l=\sqrt{R^2}=\sqrt{0.6912}\approx 0.8314$

Quadratic: $r_q=\sqrt{R^2}=\sqrt{0.9634}\approx 0.9815$

Exponential: $r_e=\sqrt{R^2}=\sqrt{0.9978}\approx 0.9989$
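
The three square roots can be reproduced with a few lines of Python. This is only a sketch: the $R^2$ values are copied from the solution above, and the dictionary names are illustrative.

```python
import math

# Coefficients of determination reported for each fitted model.
r_squared = {"linear": 0.6912, "quadratic": 0.9634, "exponential": 0.9978}

# Correlation coefficient taken as the positive square root of R^2.
r = {name: math.sqrt(value) for name, value in r_squared.items()}

for name, value in r.items():
    print(f"{name:<12} r = {value:.4f}")

best = max(r, key=r.get)  # model whose r is closest to 1
print("strongest fit:", best)
```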

Looking at those numbers, we can see that $r_e$ is the largest correlation coefficient, so the exponential model $\hat{y}=0.583e^{25.818x}$ is the best option for this set of data.
Now for the final part of this question, we take the model that was selected and substitute the value $0.16$ for $x$ in the equation.
$$\begin{align*} \hat{y}&=0.583e^{25.818 \cdot 0.16}\\ \hat{y}&=0.583e^{4.13088}\\ \hat{y}&\approx 0.583(62.23266)\\ \hat{y}&\approx 36.28 \end{align*}$$

This means that, according to the model, a $160$ pound individual who has had 6 drinks and has a BAC of $0.16$ is about $36.28$ times as likely to have an accident as a person who has had no drinks and has a BAC of $0$.
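
The substitution can be checked with a short Python sketch. The coefficients are the ones quoted in the solution above, and the function name `relative_risk` is illustrative.

```python
import math

def relative_risk(bac):
    """Exponential model from the solution: y-hat = 0.583 * e^(25.818 * x)."""
    return 0.583 * math.exp(25.818 * bac)

prediction = relative_risk(0.16)
print(round(prediction, 2))
```

Evaluating the same function at a BAC of $0$ returns $0.583$ rather than the observed value of $1$, which is a reminder that a model approximates the data instead of passing through every point.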

2.5.2 Building Logarithmic Models

As you have probably seen, not all data sets are suited to a linear, quadratic, or even an exponential model. The list of function types that we can fit to data is larger than just those three. In an exponential growth function, growth starts slowly and then becomes more rapid as time progresses. But what if the change was more rapid initially, and over time the growth slowed down but never leveled off at an asymptote? This type of data would most likely fit a logarithmic function. A logarithmic function is another function family that can be used to model data.

A logarithmic function can take on a number of characteristics. One of those characteristics is the base of the function. The Arabic number system (our number system) is a base ten system. Our digits are $0,1,2,3,4,5,6,7,8,9$ and every number after that is just a combination of those ten digits. Most computer systems work in base two, or binary. That number system uses the digits $0$ and $1$, and every number after that is just a combination of those two digits. At this point, you've probably realized that there are multiple types of number systems. In the same way, a logarithm can be written in more than one base, such as base ten or base $e$.

Logarithmic Function

A logarithmic function is a function with base $b$, where $b>0$, $b \neq 1$, and the input $x$ is greater than zero.

$$f(x)=a \cdot \log_b(x)$$

Graphically, the function would look similar to this:

Logarithmic Curve $f(x)=\log_{10}(x)$

Figure 2.5.5: Graph of $f(x)=\log_{10}(x)$

Logarithmic Curve $f(x)=\ln(-x)$

Figure 2.5.6: Graph of $f(x)=\ln(-x)$

In the first graph, the function grows quickly at first, and then over time the growth slows down. In the second graph, $f(x)=\ln(-x)$, the input is negated, so the curve is the mirror image of $\ln(x)$ across the $y$-axis and is defined only for negative values of $x$. The $\ln$ stands for the natural log and is the logarithmic function with base $e$ ($\log_e(x)=\ln(x)$). The basic logarithmic function that is most typically used for modeling is $y=a+b\ln(x)$. Note that in this form $a$ and $b$ are coefficients of the model, and $b$ is no longer the base. For this function:

All input values, $x$, must be greater than zero.
The point $(1,a)$ is on the graph of the model.
If $b>0$, the model is increasing. Growth is rapid at first and then slows down as time progresses.
If $b<0$, the model is decreasing. Decay is rapid at first and then slows down as time progresses.
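
These properties can be observed numerically. The following Python sketch uses hypothetical coefficients $a=10$ and $b=3$, chosen only for illustration; the function name `log_model` is also an assumption.

```python
import math

def log_model(a, b, x):
    """Evaluate y = a + b*ln(x); the model is only defined for x > 0."""
    if x <= 0:
        raise ValueError("x must be greater than zero")
    return a + b * math.log(x)

a, b = 10.0, 3.0  # hypothetical coefficients with b > 0
ys = [log_model(a, b, x) for x in range(1, 7)]
steps = [ys[i + 1] - ys[i] for i in range(len(ys) - 1)]

print("value at x=1:", ys[0])  # equals a, because ln(1) = 0
print("step sizes:", [round(s, 3) for s in steps])
```

The step sizes are all positive but get smaller, which matches the description of rapid growth that slows over time. Replacing $b$ with a negative number flips the steps to negative values that shrink in size.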

Example 2

Due to advances in medicine and higher standards of living, life expectancy has been increasing in most developed countries since the beginning of the 20th century. The table below shows the average life expectancies, in years, of Americans from $1900$ to $2010$.

Year	$1900$	$1910$	$1920$	$1930$	$1940$	$1950$	$1960$	$1970$	$1980$	$1990$	$2000$	$2010$
Life Expectancy	$47.3$	$50.0$	$54.1$	$59.7$	$62.9$	$68.2$	$69.7$	$70.8$	$73.7$	$75.4$	$76.8$	$78.7$

Let $x$ represent the time in decades starting with $x=1$ for the year $1900$, $x=2$ for the year $1910$, and so on. Let $y$ represent the corresponding life expectancy. Fit a logarithmic regression model using technology.
Use the model to predict the average American life expectancy for the year 2030.

Solution

The following graph was constructed in Microsoft Excel by fitting a logarithmic trendline.

Excel display of life expectancy values as a function of decades since 1900, with a logarithmic model fit to the data, displaying both the $\hat{y}$ equation and the $R^2$ value.

Image 2.5.2: Regression Run on Data for a Logarithmic Model.

From this graph, we can find the correlation coefficient ($r$) using the coefficient of determination ($R^2$).
Logarithmic: $r=\sqrt{R^2}=\sqrt{0.9563}\approx 0.9779$

This suggests that there is a strong positive correlation. Additionally, the model that best fits the data is given by $\hat{y}=42.527+13.858\ln(x)$.
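
For readers without Excel, a logarithmic fit can be sketched in plain Python. The approach below is an assumption about the method, not a description of what Excel does internally: it replaces each $x$ with $\ln(x)$ and then runs ordinary least-squares linear regression. The data come from the table above, and the function name `fit_log_model` is illustrative.

```python
import math

years = list(range(1900, 2011, 10))
life = [47.3, 50.0, 54.1, 59.7, 62.9, 68.2, 69.7, 70.8, 73.7, 75.4, 76.8, 78.7]

def fit_log_model(xs, ys):
    """Least-squares fit of y = a + b*ln(x); returns (a, b, r)."""
    us = [math.log(x) for x in xs]  # transform the inputs
    n = len(us)
    u_bar = sum(us) / n
    y_bar = sum(ys) / n
    s_uu = sum((u - u_bar) ** 2 for u in us)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_uy = sum((u - u_bar) * (y - y_bar) for u, y in zip(us, ys))
    b = s_uy / s_uu                  # slope on the ln(x) scale
    a = y_bar - b * u_bar            # intercept, the model value at x = 1
    r = s_uy / math.sqrt(s_uu * s_yy)
    return a, b, r

# x = 1 for 1900, x = 2 for 1910, and so on up to x = 12 for 2010.
xs = [(year - 1900) // 10 + 1 for year in years]
a, b, r = fit_log_model(xs, life)
print(f"y-hat = {a:.3f} + {b:.3f} ln(x),  r = {r:.4f}")
```

The printed coefficients should agree closely with the trendline equation reported above, up to rounding.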
Next, we use our model to predict the life expectancy in the year 2030. First, we find how many decades 2030 is from 1900. Keep in mind there are 10 years in 1 decade.
$$\frac{2030-1900}{10} = \frac{130}{10} = 13 \text{ decades}$$

Because $x=1$ represents the year $1900$ (not $x=0$), the year 2030 corresponds to $x=13+1=14$. By substituting $x=14$ into our equation, we get:

$$\begin{align*} \hat{y}&=42.527+13.858\ln(14)\\ \hat{y}&\approx 42.527+13.858(2.639057)\\ \hat{y}&\approx 42.527+36.572052\\ \hat{y}&\approx 79.10 \end{align*}$$

This means that in the year 2030, which is 13 decades after 1900, the model predicts that the average American can expect to live to be about 79 years old.

2.5.3 Exercises

Coming soon: Additional exercises for exponential and logarithmic models will be added here.