Coursera Machine learning course starts Monday, 17 Oct


I’m going to do the Coursera Machine Learning course that starts tomorrow (Monday, 17 Oct.). My experience of machine learning is limited to university AI a couple of decades ago (I presume it’s come on a bit since then), along with some computer vision stuff involving Fourier transforms and suchlike, but that’s about it.

It’d be more fun to do this with other folk, so if anyone’s interested, let me know so that we can share ideas and experiences.


I’ve signed up for “The Andrew Ng” course three times, I think, and never had the time to commit to it – which is totally my fault, as I know others who have made it work. I’m staring down the barrel of a nightmare quarter of client work, but… am tempted to give it another swing. However, I can’t easily see how many hours a week they’re saying it needs – was it 8-10?

Week one looks like 3 hours of videos + some tests/quizzes (no prog’ing). After that, it looks like about one hour (and a bit) of video + quiz + prog’ing assignment, which they estimate at 3 hours. So, looks like about 4 hours/week minimum.


That’s basically 2x what I usually have available as ‘me time’ these days, but will give it a shot.

Also: how did my life end up like this? :wink:

Have you ponied up the cash for the certified version?

You had kids. And that trumps this, as if I need to tell you.

No, I’m free-loading. I have enough certificates that no-one is interested in :slight_smile:


Yeah, family > everything, for sure – am just gently lamenting a work ethic that always starts out as a 4d/week arrangement and slides into 5+d :slight_smile:

Interestingly, on Chrome, Coursera class pages seem to eat all my CPU. Fine in Safari.

This will come in handy for any OSXers who also give the course a swing:

After a couple of hours of faffing with the homebrew version of Octave to get the gui working – I couldn’t get it to display plots – I went for the 4.0.3 binary linked to here.

Hard not to have fun with his stuff:


When I was contracting at Jaguar, I sat near the IBM mainframe performance team. They produced plots just like this to decide how to put the busy bytes on the faster parts of the disk. Dark arts. I can’t remember why they needed four dimensions. Does this one use four, or is it just using colour to emphasise the ‘vertical’ co-ordinate and heighten the 3-D effect?

Seriously though, I’d heard there was a Free Matlab and forgotten about it. That might be useful to me. I helped move Matlab off a Windows cluster onto a multi-core Linux box a few years ago, just to save on licensing costs. I was doing it for a guy who used to work at the Met Office, so he could build climate models. Oddly, I want to model octaves. I just need to write some Clojure first :-/

I’m suddenly struck by the cruelty of being forced to choose between studying either AI or non-linear programming in my computer science course. It also meant I had to write some COBOL. It doesn’t count, because I never compiled it.

So, against the odds, am definitely into this. Is anyone else, or is it just me and @auxbuss?

So, week one down, ten to go – although there’s an optional module on revising linear algebra that I’ll muddle through over the weekend.

So far, we’ve covered supervised (regression and classification) and unsupervised learning; the cost function for linear regression; and we’re approaching minimising it with the gradient descent method. Fancy lingo, huh?

Basically, all that means is that we’ll be able to find a best-fit line for a bunch of data points on a graph – albeit in three dimensions (for starters, yo). I get the impression we’re learning the equivalent of “I am a penguin” on Duolingo. Why soy and not estoy? Dunno. What’s a gato?
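For anyone who wants to see the lingo in code: here’s my own toy sketch of the idea in Python (the course itself uses Octave; the data, learning rate, and iteration count below are made up for illustration):

```python
# Fit a line y = theta0 + theta1 * x by gradient descent on the
# squared-error cost J = (1/2m) * sum((prediction - y)^2).

def gradient_descent(xs, ys, alpha=0.01, iters=5000):
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(iters):
        # prediction errors for the current parameters
        errs = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # partial derivatives of J with respect to theta0 and theta1
        grad0 = sum(errs) / m
        grad1 = sum(e * x for e, x in zip(errs, xs)) / m
        # step downhill
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

# Points lying exactly on y = 2x + 1 should recover those parameters.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
theta0, theta1 = gradient_descent(xs, ys)
```

Because the points sit exactly on y = 2x + 1, the parameters converge to roughly theta0 = 1, theta1 = 2 – that’s the “best-fit line” in two dimensions.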

Cons: Some of the wordy definitions are a bit woolly. I think this is just the nature of the medium. Usually, when studying, you’d be boning up on the background in some dense textbook. This doesn’t detract from things, though (see pros) – though it does lead to some head-scratching when you do the tests.

Pros: A lot of effort is made to describe ways in which to understand the mathematics intuitively. This is very refreshing. I think most folk, when they see partial differential equations, mentally find themselves at the top of a roller coaster ride, but Mr Ng disarms this nicely by basically pointing out that it’s just a slope, dude. Look! No hands! Waaaah!

So, basically a week of learning the essential vocab. Four stars: wordy descriptions could be improved.


Interesting! I was looking at an A Level maths text book in Waterstones, with the idea of revising calculus. “Woolly” is exactly what I would have called it. It ‘showed you one’ but didn’t explain things properly. I’ve also looked at an online guide and there was an error on the first page that I only knew because I’d done it before. I couldn’t report the error and didn’t trust it after that. I didn’t believe that standards had dropped but I think they’ve changed.

I came here to post that Andrew Ng’s teaching style, particularly around de-mathsing the maths, is really working for me. (I have an A-level Pure and Stats background, but it’s pretty rusty.)

Loving this.

It’s strange how calculus is taught in a way that makes it so scary. After all, it’s only about slopes and summations, but it’s all a bit magical unless the how and why are explained.

The old books seem to do a better job. The classic being Thompson’s Calculus Made Easy. It’s out of copyright and can be had from, say, Project Gutenberg. Talk of “mathematical men” and schoolboys might irk modern sensibilities, though.

Whenever my nukes lecturer was about to write up some horrendous maths, he’d say “and as every schoolboy knows”.

I thought I understood at the time but I’ve forgotten and I think I was taught a trick to differentiate, that I didn’t really understand.

calculus made easy

Take y = x^2. If x grows, x^2 grows. And if x^2 grows, then y grows. So, let x grow a little bit bigger and become x + dx. It follows that y will grow a bit bigger and become y + dy.

So, we have:

y + dy = (x + dx)^2

y + dy = x^2 + 2x·dx + (dx)^2

But y = x^2 and (dx)^2 is effectively zero, so:

dy = 2x·dx 


dy/dx = 2x

And we can prove, in general, that for:

y = x^n


dy/dx = nx^(n−1)

That’s half of calculus in a nutshell. Of course, the formulae get more complicated, but we can use computers to do the mechanical solving these days.
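And you can let the computer check the y = x^2 result numerically. A quick sketch in Python (the function and the step size h are arbitrary choices of mine):

```python
# Approximate dy/dx with a symmetric difference quotient:
# slope ≈ (f(x + h) - f(x - h)) / (2h) for a small h.

def slope(f, x, h=1e-6):
    return (f(x + h) - f(x - h)) / (2 * h)

square = lambda x: x ** 2
print(slope(square, 3))  # close to 2*3 = 6, matching dy/dx = 2x
```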

The thing that gets left out – the thing that really matters, imo – is what dy/dx means.

When dy/dx = 0, you have a horizontal line. It’s a local minimum or maximum. It’s a change of behaviour (when applied to physical systems). And it’s something that can be solved for. dy/dx is interesting. Pretty much always.
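To make that concrete, here’s a tiny invented example in Python: for y = (x − 3)^2 + 1 the slope is dy/dx = 2(x − 3), so the minimum sits where the slope hits zero, at x = 3 – and a crude numerical scan agrees:

```python
# y = (x - 3)^2 + 1 has its minimum where dy/dx = 2(x - 3) = 0, i.e. x = 3.

def f(x):
    return (x - 3) ** 2 + 1

def slope(x, h=1e-6):
    # symmetric difference approximation to dy/dx
    return (f(x + h) - f(x - h)) / (2 * h)

# scan a grid from 0 to 6 and keep the x where the slope is nearest zero
grid = [i / 100 for i in range(601)]
flattest = min(grid, key=lambda x: abs(slope(x)))
print(flattest)  # 3.0 – the stationary point, and here the minimum
```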

Everyone knows about dark matter these days. The way we infer dark matter is based on the flat (dy/dx = 0) rotational velocity curves of galaxies.

When dy/dx = 1, then x and y are growing 1:1. It’s the tipping point.

When dy/dx starts getting big, then things are starting to run away. Like the GBP after Brexit.

Another way to think of calculus is from a system’s behaviour and how its function might look to create the observed behaviour.

I think it’s pretty cool that Newton’s ideas are being used to further our understanding of machine learning. I wonder what he’d make of it.

Thanks, but I think that’s my problem: “And we can prove, in general, that for…”. No one ever did prove it. I was just given it as a rule to follow, along with ‘look at an integral and it will be obvious what it was differentiated from’ as my method of integrating, and “that will be good enough for A level”, as though there was more that I wasn’t being told. I learned, but I didn’t understand, leaving me feeling that I was using a dodgy heuristic.

I probably didn’t care at the time that I hadn’t seen a proof but physics education makes you very nervous about accepting things you haven’t really understood, even after you’ve forgotten most of it :grinning:

Your explanation of local minima was better than Edexcel’s. I think they described it as “maximum rate of change” which was true for the one function they showed but I could see wouldn’t be for a function with ‘a knee’. I don’t think you should ever lie to children.

Proudly sponsored by Bytemark