I'm making this up on the spot, so mea culpa if it's messy, but I find Taylor expansions really useful in these sorts of situations.
If we start with the function f(g(x)), then by definition we have
(df)/(dx) = lim_(h to 0) (f(g(x+h)) - f(g(x)))/(h)
Using Taylor (and assuming f and g are smooth enough for the expansions to hold), we're gonna expand the inner g(x+h) as follows:
g(x+h) = g(x) + h g'(x) + O(h^2)
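If you want to convince yourself of that O(h^2) term, here is a rough numerical sketch. The choice g = sin, the point x = 1 and the helper name taylor_remainder are my own assumptions for illustration, nothing more. It measures what is left over after the first-order expansion and divides by h^2; if the remainder really is O(h^2), those ratios should stay bounded (for a smooth g they settle near g''(x)/2).

```python
import math

def taylor_remainder(g, dg, x, h):
    """Return g(x+h) - (g(x) + h*g'(x)), i.e. the part hidden inside O(h^2)."""
    return g(x + h) - (g(x) + h * dg(x))

# Assumed example: g = sin, so g' = cos and g'' = -sin.
x = 1.0
steps = (1e-1, 1e-2, 1e-3)

# remainder / h^2 should stay bounded as h shrinks
ratios = [taylor_remainder(math.sin, math.cos, x, h) / h**2 for h in steps]

for h, r in zip(steps, ratios):
    print(f"h={h:g}  remainder/h^2={r:.5f}")
```

The ratios not blowing up as h shrinks is exactly what the O(h^2) notation is promising.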
So to clarify, we have
(df)/(dx) = lim_(h to 0) (f color(red)(( g(x) + h g'(x) + O(h^2))) - f(g(x)))/(h)
Now, to simplify a little, we set eta(x) = g'(x) + O(h), so that h eta(x) = h g'(x) + O(h^2):
(df)/(dx) = lim_(h to 0) (f ( g(x) + h eta (x) ) - f(g(x)))/(h) qquad square
And now we're gonna expand f ( g(x) + h eta (x) ) by the same process, this time expanding f about the point g(x):
f ( g(x) + h eta (x) ) = f(g(x)) + h eta (x) f'(g(x)) + O(h^2)
= f(g(x)) + h ( g'(x) + O(h)) f'(g(x)) + O(h^2)
= f(g(x)) + h f'(g(x)) g'(x) + O(h^2)
We can substitute that back into the limit marked square:
(df)/(dx) = lim_(h to 0) ( f(g(x)) + h f'(g(x)) g'(x) + O(h^2)- f(g(x)))/(h)
(df)/(dx) = lim_(h to 0) ( h f'(g(x)) g'(x) + O(h^2))/(h)
= lim_(h to 0) ( f'(g(x)) g'(x) + O(h) )
= f'(g(x)) g'(x)
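
As a sanity check on the result, here is a quick numerical sketch. The choices f = exp, g = sin, the point x = 1 and the helper names are assumptions I picked for illustration. It compares the difference quotient of f(g(x)) with f'(g(x)) g'(x) for shrinking h. If the derivation is right, the error should shrink roughly in proportion to h, matching the O(h) term in the second-to-last line.

```python
import math

def difference_quotient(F, x, h):
    """Forward difference quotient (F(x+h) - F(x)) / h."""
    return (F(x + h) - F(x)) / h

# Assumed example functions: f = exp (so f' = exp), g = sin (so g' = cos).
f, df = math.exp, math.exp
g, dg = math.sin, math.cos

def composite(x):
    return f(g(x))

x = 1.0
exact = df(g(x)) * dg(x)  # the chain rule value f'(g(x)) g'(x)

# Error of the difference quotient for a few step sizes
errors = {
    h: abs(difference_quotient(composite, x, h) - exact)
    for h in (1e-2, 1e-3, 1e-4)
}

for h, err in errors.items():
    print(f"h={h:g}  error={err:.3e}  error/h={err / h:.3f}")
```

The error/h column staying roughly constant is the numerical fingerprint of an O(h) error, so the limit really is f'(g(x)) g'(x).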