The chain rule

Learning Objectives

  • What is a composite function and how do we recognize its structure algebraically?
  • Given a composite function $C(x) = f(g(x))$ that is built from differentiable functions $f$ and $g$, how do we compute $C'(x)$ in terms of $f$, $g$, $f'$, and $g'$? What is the statement of the Chain Rule?

Introduction

In addition to learning how to differentiate a variety of basic functions, we have also been developing our ability to use rules to differentiate certain algebraic combinations of them.

State the rule(s) required to find the derivative of each of the following combinations of $f(x) = \sin(x)$ and $g(x) = x^2$: $s(x) = 3x^2 - 5\sin(x)$, $p(x) = x^2 \sin(x)$, and $q(x) = \frac{\sin(x)}{x^2}$.

There is one more natural way to combine basic functions algebraically, and that is by composing them. For instance, let's consider the function $C(x) = \sin(x^2)$, and observe that any input $x$ passes through a chain of functions. In the process that defines the function $C(x)$, $x$ is first squared, and then the sine of the result is taken. Using an arrow diagram, $x \longrightarrow x^2 \longrightarrow \sin(x^2)$.

In terms of the elementary functions $f$ and $g$, we observe that $x$ is the input for the function $g$, and the result is used as the input for $f$. We write $C(x) = f(g(x)) = \sin(x^2)$ and say that $C$ is the composition of $f$ and $g$. We will refer to $g$, the function that is first applied to $x$, as the inner function, while $f$, the function that is applied to the result, is the outer function.
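
The inner/outer structure described above can be sketched in a few lines of Python. This is only an illustration: the helper name `compose` and the second composite `D` are assumptions introduced here, not part of the text. It shows that swapping the roles of the inner and outer functions generally produces a different function.

```python
import math

def compose(outer, inner):
    """Return the composite function x -> outer(inner(x)). (Illustrative helper.)"""
    return lambda x: outer(inner(x))

def g(x):
    # Inner function: applied to x first.
    return x ** 2

def f(u):
    # Outer function: applied to the output of g.
    return math.sin(u)

C = compose(f, g)  # C(x) = sin(x^2)
D = compose(g, f)  # D(x) = (sin(x))^2, the composition in the other order

x = 1.5
print("C(x) =", C(x), " sin(x^2)   =", math.sin(x ** 2))
print("D(x) =", D(x), " (sin(x))^2 =", math.sin(x) ** 2)
```

Reading the arrow diagram from left to right gives the order in which the functions are applied, which is why `g` is passed to `compose` as the inner function.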

Given a composite function $C(x) = f(g(x))$ that is built from differentiable functions $f$ and $g$, how do we compute $C'(x)$ in terms of $f$, $g$, $f'$, and $g'$? In the same way that the rate of change of a product of two functions, $p(x) = f(x) \cdot g(x)$, depends on the behavior of both $f$ and $g$, it makes sense intuitively that the rate of change of a composite function $C(x) = f(g(x))$ will also depend on some combination of $f$ and $g$ and their derivatives. The rule that describes how to compute $C'$ in terms of $f$ and $g$ and their derivatives is called the chain rule.

But before we can learn what the chain rule says and why it works, we first need to be comfortable decomposing composite functions so that we can correctly identify the inner and outer functions, as we did in the example above with $C(x) = \sin(x^2)$.

Preview Activity

For each function given below, identify its fundamental algebraic structure.

In particular, is the given function a sum, product, quotient, or composition of basic functions? If the function is a composition of basic functions, state a formula for the inner function $g$ and the outer function $f$ so that the overall composite function can be written in the form $f(g(x))$. If the function is a sum, product, or quotient of basic functions, use the appropriate rule to determine its derivative.

(a)

$h(x) = \tan(2^x)$

(b)

$p(x) = 2^x \tan(x)$

(c)

$r(x) = (\tan(x))^2$

(d)

$m(x) = e^{\tan(x)}$

(e)

$w(x) = \sqrt{x} + \tan(x)$

(f)

$z(x) = \sqrt{\tan(x)}$

Exercise

Each of the numbered functions below can be described as having one of the algebraic structures in the bulleted list that follows. Match each function with its corresponding structure. It is possible that not all structures are used, and it is possible that not every function has a matching description of its algebraic structure.

  1. $g(x) = \dfrac{\sin(x)}{e^x}$
  2. $r(x) = e^x - \sin(x)$
  3. $s(x) = e^{\sin(x)}$
  4. $h(x) = \sin(e^x)$
  • the quotient of $\sin(x)$ and $e^x$
  • the composition of $e^x$ and $\sin(x)$, where the sine function is evaluated at the exponential function
  • the difference of $e^x$ and $\sin(x)$
  • the composition of $e^x$ and $\sin(x)$, where the exponential function is evaluated at the sine function
  • the product of $e^x$ and $\sin(x)$
  • the sum of $e^x$ and $\sin(x)$
Exercise

If you do not find a match in the previous exercise, explain why.

Exercise

You are riding a hot air balloon that is moving straight upward. You have access to an altimeter that tells you how high (in miles) above the ground the balloon is at a certain number of hours since the ride started [you can call it $A(t)$]. You also have access to a gauge that gives the air temperature (in degrees Fahrenheit) that you feel as a function of altitude [you can call it $F(A)$].

One hour into the ride, you happen to look at the screen and see the following: $A'(1) = 2$ and $F'(2) = -16$. In your own words, respond to each of the following.

(a)

What does $A'(1) = 2$ tell you about your ride in the hot air balloon? Include units.

(b)

What does $F'(2) = -16$ tell you about your ride in the hot air balloon? Include units.

(c)

Explain in your own words how the air temperature you feel is changing with time, one hour into the ride, and find that rate.

The chain rule

Often a composite function cannot be written in an alternate algebraic form. For instance, the function $C(x) = \sin(x^2)$ cannot be expanded or otherwise rewritten, so it presents no alternate approaches to taking the derivative. But some composite functions can be expanded or simplified, and these provide a way to explore how the chain rule works.

Let $f(x) = -4x + 7$ and $g(x) = 3x - 5$. Determine a formula for $C(x) = f(g(x))$ and compute $C'(x)$. How is $C'$ related to $f$ and $g$ and their derivatives?
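
One way to carry out this computation is sketched below: expand the composition directly, differentiate the resulting linear function, and compare the result with the product of the two slopes.

```latex
\begin{align*}
C(x) &= f(g(x)) = -4(3x - 5) + 7 = -12x + 27, \\
C'(x) &= -12, \\
f'(x) &= -4, \qquad g'(x) = 3, \\
f'(g(x)) \cdot g'(x) &= (-4)(3) = -12 = C'(x).
\end{align*}
```

Because $f'$ is constant here, evaluating it at $g(x)$ changes nothing. The role of that evaluation only becomes visible when the outer function is nonlinear.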

It may seem that the preceding example is too elementary to illustrate how to differentiate a composite function. Linear functions are the simplest of all functions, and composing linear functions yields another linear function. While this example does not illustrate the full complexity of a composition of nonlinear functions, at the same time we remember that any differentiable function is locally linear, and thus any function with a derivative behaves like a line when viewed up close. The fact that the derivatives of the linear functions $f$ and $g$ are multiplied to find the derivative of their composition turns out to be a key insight.

We now consider a composition involving a nonlinear function.

Let $C(x) = \sin(2x)$. Use the double angle identity to rewrite $C$ as a product of basic functions, and use the product rule to find $C'$. Rewrite $C'$ in the simplest form possible.
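
A sketch of the computation, using the double angle identities $\sin(2x) = 2\sin(x)\cos(x)$ and $\cos(2x) = \cos^2(x) - \sin^2(x)$:

```latex
\begin{align*}
C(x) &= \sin(2x) = 2\sin(x)\cos(x), \\
C'(x) &= 2\left[\cos(x)\cos(x) + \sin(x)\bigl(-\sin(x)\bigr)\right] \\
      &= 2\left[\cos^2(x) - \sin^2(x)\right] \\
      &= 2\cos(2x).
\end{align*}
```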

In the preceding example, if we let $g(x) = 2x$ and $f(x) = \sin(x)$, we observe that $C(x) = f(g(x))$. Now, $g'(x) = 2$ and $f'(x) = \cos(x)$, so we can view the structure of $C'(x)$ as $C'(x) = 2\cos(2x) = g'(x) f'(g(x))$.

In this example, as in the example involving linear functions, we see that the derivative of the composite function $C(x) = f(g(x))$ is found by multiplying the derivatives of $f$ and $g$, but with $f'$ evaluated at $g(x)$.

It makes sense intuitively that these two quantities are involved in the rate of change of a composite function: if we ask how fast $C$ is changing at a given $x$ value, it clearly matters how fast $g$ is changing at $x$, as well as how fast $f$ is changing at the value of $g(x)$. It turns out that this structure holds for all differentiable functions, as is stated in the Chain Rule. (Like other differentiation rules, the Chain Rule can be proved formally using the limit definition of the derivative.)

The Chain Rule

If $g$ is differentiable at $x$ and $f$ is differentiable at $g(x)$, then the composite function $C$ defined by $C(x) = f(g(x))$ is differentiable at $x$ and $C'(x) = f'(g(x)) g'(x)$.

As with the product and quotient rules, it is often helpful to think verbally about what the chain rule says: If $C$ is a composite function defined by an outer function $f$ and an inner function $g$, then $C'$ is given by the derivative of the outer function evaluated at the inner function, times the derivative of the inner function.
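
As an informal sanity check (not a proof), the formula $C'(x) = f'(g(x)) g'(x)$ can be compared against a numerical estimate of the derivative. The sketch below uses the earlier example $C(x) = \sin(x^2)$. The helper `central_difference`, the step size, and the sample points are assumptions chosen for illustration.

```python
import math

def central_difference(func, x, h=1e-6):
    """Approximate func'(x) by the symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)

def g(x):            # inner function
    return x ** 2

def g_prime(x):      # derivative of the inner function
    return 2 * x

def f(u):            # outer function
    return math.sin(u)

def f_prime(u):      # derivative of the outer function
    return math.cos(u)

def C(x):            # composite function C(x) = f(g(x)) = sin(x^2)
    return f(g(x))

def C_prime(x):      # chain rule: outer derivative at the inner function, times inner derivative
    return f_prime(g(x)) * g_prime(x)

for x in (0.5, 1.0, 2.0):
    print(x, C_prime(x), central_difference(C, x))
```

The two printed columns should agree to several decimal places. The small discrepancy comes from the finite step size in the difference quotient, not from the chain rule.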

It is helpful to identify clearly the inner function $g$ and outer function $f$, compute their derivatives individually, and then put all of the pieces together by the chain rule.

Determine the derivative of the function $r(x) = (\tan(x))^2$.
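
A possible worked solution follows the pattern just described: name the inner and outer functions, differentiate each, and assemble the pieces with the chain rule.

```latex
\begin{align*}
f(x) &= x^2, & g(x) &= \tan(x), \\
f'(x) &= 2x, & g'(x) &= \sec^2(x), \\
f'(g(x)) &= 2\tan(x), \\
r'(x) &= f'(g(x)) \, g'(x) = 2\tan(x)\sec^2(x).
\end{align*}
```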

As a side note, we remark that $r(x)$ is usually written as $\tan^2(x)$. This is common notation for powers of trigonometric functions: $\cos^4(x)$, $\sin^5(x)$, and $\sec^2(x)$ are all composite functions, with the outer function a power function and the inner function a trigonometric one.

Activity

For each function given below, identify an inner function $g$ and outer function $f$ to write the function in the form $f(g(x))$. Determine $f'(x)$, $g'(x)$, and $f'(g(x))$, and then apply the chain rule to determine the derivative of the given function.

(a)

$h(x) = \cos(x^4)$

(b)

$p(x) = \sqrt{\tan(x)}$

(c)

$s(x) = 2^{\sin(x)}$

(d)

$z(x) = \cot^5(x)$

(e)

$m(x) = (\sec(x) + e^x)^9$

Using multiple rules simultaneously

The chain rule now joins the sum, constant multiple, product, and quotient rules in our collection of techniques for finding the derivative of a function through understanding its algebraic structure and the basic functions that constitute it. It takes practice to get comfortable applying multiple rules to differentiate a single function, but using proper notation and taking a few extra steps will help.

Find a formula for the derivative of $h(t) = 3^{t^2 + 2t}\sec^4(t)$.
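
One way to organize this computation is sketched below. The function is a product of two composite functions, so the product rule is applied first and the chain rule is applied to each factor.

```latex
\begin{align*}
\frac{d}{dt}\left[3^{t^2+2t}\right] &= 3^{t^2+2t}\ln(3)\,(2t+2), \\
\frac{d}{dt}\left[\sec^4(t)\right] &= 4\sec^3(t)\cdot\sec(t)\tan(t) = 4\sec^4(t)\tan(t), \\
h'(t) &= 3^{t^2+2t}\ln(3)\,(2t+2)\,\sec^4(t) + 3^{t^2+2t}\cdot 4\sec^4(t)\tan(t) \\
      &= 3^{t^2+2t}\sec^4(t)\left[\ln(3)\,(2t+2) + 4\tan(t)\right].
\end{align*}
```

Computing the derivative of each factor separately before applying the product rule keeps the bookkeeping manageable.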

Activity

For each of the following functions, find the function's derivative. State the rule(s) you use, label relevant derivatives appropriately, and be sure to clearly identify your overall answer.

(a)

$p(r) = 4\sqrt{r^6 + 2e^r}$

(b)

$m(v) = \sin(v^2) \cos(v^3)$

(c)

$h(y) = \frac{\cos(10y)}{e^{4y}+1}$

(d)

$s(z) = 2^{z^2 \sec(z)}$

(e)

$c(x) = \sin(e^{x^2})$

The chain rule now adds substantially to our ability to compute derivatives. Whether we are finding the equation of the tangent line to a curve, the instantaneous velocity of a moving particle, or the instantaneous rate of change of a certain quantity, if the function under consideration is a composition, the chain rule is often an essential tool.

Activity

Use known derivative rules, including the chain rule, as needed to respond to each of the following prompts.

(a)

Find an equation for the tangent line to the curve $y = \sqrt{e^x + 3}$ at the point where $x = 0$.

(b)

If $s(t) = \frac{1}{(t^2+1)^3}$ represents the position function of a particle moving horizontally along an axis at time $t$ (where $s$ is measured in inches and $t$ in seconds), find the particle's instantaneous velocity at $t = 1$. Is the particle moving to the left or right at that instant?

(c)

At sea level, air pressure is 30 inches of mercury. At an altitude of $h$ feet above sea level, the air pressure, $P$, in inches of mercury, is given by the function $P = 30 e^{-0.0000323 h}$. Compute $dP/dh$ and explain what this derivative function tells you about air pressure, including a discussion of the units on $dP/dh$. In addition, determine how fast the air pressure is changing for a pilot of a small plane passing through an altitude of $1000$ feet.

(d)

Suppose that $f(x)$ and $g(x)$ are differentiable functions and that the following information about them is known:

$x$     $f(x)$   $f'(x)$   $g(x)$   $g'(x)$
$-1$    $2$      $-5$      $-3$     $4$
$2$     $-3$     $4$       $-1$     $2$

If $C(x)$ is a function given by the formula $f(g(x))$, determine $C'(2)$. In addition, if $D(x)$ is the function $f(f(x))$, find $D'(-1)$.

The composite version of basic function rules

As we gain more experience with differentiation, we will become more comfortable in simply writing down the derivative without taking multiple steps. This is particularly simple when the inner function is linear, since the derivative of a linear function is a constant.

For each of the following composite functions whose inside function is linear, find the overall function's derivative using the chain rule: $f(x) = (5x+7)^{10}$, $g(x) = \tan(17x)$, and $h(x) = e^{-3x}$.
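
A sketch of the three derivatives is given below. In each case the derivative of the linear inner function is a constant, which simply multiplies the derivative of the outer function evaluated at the inner function.

```latex
\begin{align*}
f'(x) &= 10(5x+7)^{9} \cdot 5 = 50(5x+7)^{9}, \\
g'(x) &= \sec^2(17x) \cdot 17 = 17\sec^2(17x), \\
h'(x) &= e^{-3x} \cdot (-3) = -3e^{-3x}.
\end{align*}
```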

More generally, we can think about how each basic function rule has a corresponding chain rule version. The next example demonstrates this for two familiar functions.

Develop a chain rule version of the two basic derivative rules that state $\frac{d}{dx}[\sin(x)] = \cos(x)$ and $\frac{d}{dx}[a^x] = a^x \ln(a)$.

An excellent exercise for getting comfortable with the derivative rules is to complete the preceding example for every basic function. That is, write down a list of all the basic functions whose derivatives you know, and list their corresponding derivatives. Then, corresponding to each basic rule, write a composite function with the inner function being an unknown function $u(x)$ and the outer function being a basic function. Finally, write the chain rule for the composite function, such as $\frac{d}{dx}[\sin(u(x))] = \cos(u(x)) \cdot u'(x)$.
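
The chain rule versions of the sine and exponential rules can be spot-checked numerically for one particular choice of inner function. The sketch below is illustrative only: the inner function $u(x) = x^3 + 1$, the base $a = 2$, the sample point, and the helper `central_difference` are all assumptions chosen for this check, and the exponential rule is taken to be $\frac{d}{dx}[a^{u(x)}] = a^{u(x)} \ln(a) \cdot u'(x)$.

```python
import math

def central_difference(func, x, h=1e-6):
    """Approximate func'(x) by the symmetric difference quotient."""
    return (func(x + h) - func(x - h)) / (2 * h)

def u(x):            # an arbitrary differentiable inner function (assumption)
    return x ** 3 + 1

def u_prime(x):      # its derivative
    return 3 * x ** 2

a = 2.0              # base for the exponential rule (assumption)

# Each entry pairs a composite function with its chain rule derivative.
rules = {
    "sin(u(x))": (lambda x: math.sin(u(x)),
                  lambda x: math.cos(u(x)) * u_prime(x)),
    "a^(u(x))":  (lambda x: a ** u(x),
                  lambda x: a ** u(x) * math.log(a) * u_prime(x)),
}

x0 = 0.7
for name, (func, deriv) in rules.items():
    print(name, deriv(x0), central_difference(func, x0))
```

Extending the `rules` dictionary with further basic functions is one way to test the list of chain rule versions produced in the exercise described above.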

Summary

A composite function is one where the input variable $x$ first passes through one function, and then the resulting output passes through another. For example, the function $h(x) = 2^{\sin(x)}$ is composite since $x \longrightarrow \sin(x) \longrightarrow 2^{\sin(x)}$. Given a composite function $C(x) = f(g(x))$ where $f$ and $g$ are differentiable functions, the chain rule tells us that $C'(x) = f'(g(x)) g'(x)$.

Exercise

Consider the basic functions $f(x) = x^3$ and $g(x) = \sin(x)$.

(a)

Let $h(x) = f(g(x))$. Find the exact instantaneous rate of change of $h$ at the point where $x = \frac{\pi}{4}$.

(b)

Which function is changing most rapidly at $x = 0.25$: $h(x) = f(g(x))$ or $r(x) = g(f(x))$? Why?

(c)

Let $h(x) = f(g(x))$ and $r(x) = g(f(x))$. Which of these functions has a derivative that is periodic? Why?

Exercise

Let $u(x)$ be a differentiable function. For each of the following functions, determine the derivative. Each response will involve $u$ and/or $u'$.

(a)

$p(x) = e^{u(x)}$

(b)

$q(x) = u(e^x)$

(c)

$r(x) = \cot(u(x))$

(d)

$s(x) = u(\cot(x))$

(e)

$a(x) = u(x^4)$

(f)

$b(x) = (u(x))^4$

Exercise

Let functions $p$ and $q$ be the piecewise linear functions given by their respective graphs in the accompanying figure. Use the graphs to answer the following questions.

The graphs of $p$ (in blue) and $q$ (in green).

(a)

Let $C(x) = p(q(x))$. Determine $C'(0)$ and $C'(3)$.

(b)

Let $Y(x) = q(q(x))$ and $Z(x) = q(p(x))$. Determine $Y'(-2)$ and $Z'(0)$.

Exercise

If a spherical tank of radius 4 feet has $h$ feet of water present in the tank, then the volume of water in the tank is given by the formula $V = \frac{\pi}{3} h^2(12-h)$.

(a)

At what instantaneous rate is the volume of water in the tank changing with respect to the height of the water at the instant $h = 1$? What are the units on this quantity?

(b)

Now suppose that the height of water in the tank is being regulated by an inflow and outflow (e.g., a faucet and a drain) so that the height of the water at time $t$ is given by the rule $h(t) = \sin(\pi t) + 1$, where $t$ is measured in hours (and $h$ is still measured in feet). At what rate is the height of the water changing with respect to time at the instant $t = 2$?

(c)

Continuing under the assumptions in (b), at what instantaneous rate is the volume of water in the tank changing with respect to time at the instant $t = 2$?

(d)

What are the main differences between the rates found in (a) and (c)? Include a discussion of the relevant units.