So if you get bored of Spivak, try Abbott's or Pugh's book. Both do a good job of motivating the rigor, but apparently Abbott is better suited for someone who has not done proofs before.
Thanks zzdfa, I might investigate those books.
And also addikaye03: yeah, seems like it could work for those functions too, I'll give it a go.
Another question. For the Dirac delta function,
$\delta(x) = 0$ for $x \neq 0$ and $\int_{-\infty}^{\infty} \delta(x) \, dx = 1$.
The derivative is meant to be defined as:
$x \, \delta'(x) = -\delta(x)$
Just wondering, does this make sense mathematically? The above expression can be derived with integration by parts, but the function isn't even continuous :/
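A sketch of the integration-by-parts argument, assuming the identity in question is $x \, \delta'(x) = -\delta(x)$ and that $f$ is a smooth test function that vanishes at infinity (both are assumptions, not something stated in the post). The identity is read as "both sides give the same number when integrated against $f$":

```latex
\begin{align*}
\int_{-\infty}^{\infty} x \, \delta'(x) \, f(x) \, dx
  &= \Big[ x f(x) \, \delta(x) \Big]_{-\infty}^{\infty}
     - \int_{-\infty}^{\infty} \delta(x) \, \frac{d}{dx}\big( x f(x) \big) \, dx \\
  &= 0 - \int_{-\infty}^{\infty} \delta(x) \, \big( f(x) + x f'(x) \big) \, dx \\
  &= -f(0) \\
  &= -\int_{-\infty}^{\infty} \delta(x) \, f(x) \, dx .
\end{align*}
```

So the statement never needs $\delta$ to be continuous, or even an ordinary function; it only needs the integrals against $f$ to agree.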
Oh I didn't notice this. Yeah, there are ways to make sense of this mathematically, as Ahmad said in the sense of distributions. It comes in handy a lot in the area of Fourier analysis (which I know a bit about), where distributions pop up naturally as Fourier transforms of functions that are locally integrable but not globally integrable (e.g.

makes sense over any finite interval, but not over an infinite interval). Anyway, the basic idea is to look at distributions in the sense of their effect on other suitably nice functions, that is, to look at $\int_{-\infty}^{\infty} \delta(x) f(x) \: dx$ for "nice" functions $f$. If you're lucky, you can then extend the result to lots more functions $f$, and you can then deal with distributions in certain situations without anything bad happening. Of course, mathematicians are always careful to only use them when they know that the theory behind them is consistent, whereas engineers and physicists will just blast away...
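To make "effect on nice functions" a bit more concrete, here is a rough numerical sketch. It assumes the delta function is approximated by a narrow unit-area Gaussian of width `eps`, and it uses `math.exp` restricted to [-1, 1] as the test function; the helper names (`gaussian_delta`, `gaussian_delta_prime`, `integrate`, `pair`) are made up for illustration and the midpoint rule is just one convenient choice.

```python
import math

def gaussian_delta(x, eps):
    """Narrow unit-area Gaussian; stands in for delta(x) as eps -> 0."""
    return math.exp(-x * x / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

def gaussian_delta_prime(x, eps):
    """Derivative in x of the narrow Gaussian; stands in for delta'(x)."""
    return -x / (eps * eps) * gaussian_delta(x, eps)

def integrate(g, a=-1.0, b=1.0, n=20_000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def pair(kernel, f, eps=1e-2):
    """Effect of the approximate distribution `kernel` on the test function f."""
    return integrate(lambda x: kernel(x, eps) * f(x))

f = math.exp  # test function with f(0) = 1 and f'(0) = 1

print(pair(gaussian_delta, f))        # should be close to f(0) = 1
print(pair(gaussian_delta_prime, f))  # should be close to -f'(0) = -1
# x * delta'(x) acting on f, which should be close to -f(0) = -1
print(pair(lambda x, eps: x * gaussian_delta_prime(x, eps), f))
```

Shrinking `eps` (and increasing `n` to keep the peak resolved) should move all three numbers closer to their limits, which is the sense in which the delta function and its derivative are defined by what they do to `f`.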

Ah thanks humph XD
Just wondering, in spherical coordinates, which is considered more standard:
Radial distance: r
Azimuth angle: $\theta$
Zenith angle: $\phi$
or
Radial distance: r
Azimuth angle: $\phi$
Zenith angle: $\theta$
If neither, is one of them preferred in Australian universities?
Stewart and Griffiths use different notation, so it's really confusing xD
I kind of prefer the first one, since $\theta$ is the azimuth angle in polar coordinates anyway. I don't see the logic in assigning it a different role.
Wikipedia and hyperphysics seem to have it in
form too 
But wolfram has it in
form 
Hmmm I'm not sure what is more standard. I guess I'd use the first one for the same reason as you, but to be honest it doesn't really matter too much, and bear in mind that you'll find plenty of cases where you'll use different coordinate systems but with familiar labels, which can be quite confusing (see e.g.
this thread).
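For what it's worth, the two labellings describe the same geometry, and only the names of the angles swap. A small sketch, assuming the first labelling means theta = azimuth and phi = zenith, and the second means theta = zenith and phi = azimuth (the function names here are made up for illustration):

```python
import math

def to_cartesian(r, azimuth, zenith):
    """Spherical to Cartesian; azimuth is measured in the xy-plane from the
    x-axis, zenith is measured down from the positive z-axis."""
    x = r * math.sin(zenith) * math.cos(azimuth)
    y = r * math.sin(zenith) * math.sin(azimuth)
    z = r * math.cos(zenith)
    return x, y, z

def first_labelling(r, theta, phi):
    """(r, theta, phi) with theta = azimuth and phi = zenith."""
    return to_cartesian(r, azimuth=theta, zenith=phi)

def second_labelling(r, theta, phi):
    """(r, theta, phi) with theta = zenith and phi = azimuth."""
    return to_cartesian(r, azimuth=phi, zenith=theta)

# The same point: azimuth 30 degrees, zenith 60 degrees, radius 2.
az, zen = math.radians(30), math.radians(60)
print(first_labelling(2.0, az, zen))
print(second_labelling(2.0, zen, az))
```

Both calls give the same Cartesian point, so whichever book you follow, the thing to check is which angle is measured from the z-axis.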