This tangent plane business relies on a cool trick that took me ages to understand just from the notes last year.
We know that the gradient vector is orthogonal to any level curve (and, one dimension up, to any level surface), so we take the function z=f(x,y) and move everything involving x, y and z to one side, leaving just a constant on the other side.
We then have something of the form g(x,y,z)=c, for example g(x,y,z)=f(x,y)-z=0, so we can treat the 3d surface as a level surface (the 3-variable version of a level curve) of some function g in 3 variables (whose own graph would live in 4d). We then know that the gradient vector of g is orthogonal to the level surface g(x,y,z)=c, but that level surface is exactly z=f(x,y), just rewritten. So at any point on the surface the gradient of g is normal to the surface, and the plane through that point with the gradient as its normal vector is the tangent plane there.
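If it helps to see numbers, here's a rough sketch of the trick in Python. The surface f(x,y)=x^2+y^2 and the point (1,2) are just made up for illustration, and the names f, grad_g and tangent_plane are mine, not from the notes. It takes g(x,y,z)=f(x,y)-z, uses its gradient as the normal, and checks that the resulting plane hugs the surface near the point.

```python
# Hypothetical example surface (not from the notes): z = f(x, y) = x^2 + y^2
def f(x, y):
    return x**2 + y**2

def grad_g(x, y):
    # g(x, y, z) = f(x, y) - z, so grad g = (f_x, f_y, -1) = (2x, 2y, -1)
    return (2 * x, 2 * y, -1.0)

def tangent_plane(x0, y0):
    """Return a function giving z on the tangent plane at (x0, y0, f(x0, y0))."""
    nx, ny, nz = grad_g(x0, y0)
    z0 = f(x0, y0)
    # Plane: n . (x - x0, y - y0, z - z0) = 0, solved for z.
    # nz is always -1 here, so dividing by it is safe.
    return lambda x, y: z0 - (nx * (x - x0) + ny * (y - y0)) / nz

plane = tangent_plane(1.0, 2.0)

# Step a tiny distance h away from the point and compare surface vs plane.
h = 1e-4
gap = abs(f(1.0 + h, 2.0 + h) - plane(1.0 + h, 2.0 + h))
print(gap)  # tiny, shrinks like h^2, which is what tangent means
```

At (1,2,5) the normal comes out as (2,4,-1), so the plane is 2(x-1)+4(y-2)-(z-5)=0, i.e. z=5+2(x-1)+4(y-2).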
Soz, pretty crappy explanation, I'm on my phone.