On Learning Some Math
In late May, I left my startup job to study math.
I had been working at the startup for the past eight months, having started in mid-September of the year before. I was one of their two backend engineers, teaming up with a remote developer in China to build out the backend for the company’s product, an iPhone game. They hired me because I aced their coding test, and I ended up being directly responsible for the majority of the company’s client-serving backend. It was demanding, it pushed me, I liked it.
I was also starting out as a part-time student in Columbia’s QMSS program, in their data science concentration. It was supposed to be a challenging track – I was warned that most students drop out. It sounded exciting. I knew my math background was relatively weak – one semester of calculus, plus discrete math. I had been fully out of school for two years at that point, so I thought I’d take it slowly and sign up for two classes. Probability and Statistics. Social Science Theory and Methods. Plus the new startup job, and the non-profit I help run. Nothing could go wrong with this plan.
I am occasionally naive.
The year was rough. 80-hour-weeks-every-week rough. I drop into a calculus-based statistics course without really knowing that much calculus. I’m pretty sure that the first time I had to actually solve an integral, it was a double. Things are coming at us pretty fast. The midterm average is less than 40%. I contemplate the aesthetics of the letter “F”. I hustle and struggle and put in the time. Random variables, probability distributions. Independence, expectation. Samples, confidence, hypotheses. I get good at breaking big problems into smaller ones. Things come together. Everything starts to make sense. I get an A. It was encouraging.
Thanks, Professor Cunningham.
I didn’t like the Theories and Methods course. Mostly because the professor thought he was too important to bother preparing his lectures. I match his effort. Get an A-. By a point. I decide that’s fair. Social science is stepping into the background for me anyway.
I sign up for Data Visualization in the Spring. Only one class, plus a casual once-weekly seminar. I am hoping for a slightly more relaxed semester. Went in to data visualization the first day and immediately realized that class was not worth the money. I only had a few semesters of grad school, and I wasn’t going to be wasting time. I transfer into Machine Learning. It seems the semester will be slightly less relaxed.
My company is getting tired of my academic activities. I didn’t particularly think it was their business what I did after hours. I was exchanging ~50 hours a week of diligent labor for an 80k salary, which I thought was actually pretty fair. I am pulling a lot of weight. They do not invest in my professional development. They are not interested in my thoughts on the product. They suggest that I start coming in on weekends. I know I have the smallest equity stake out of anyone there. There is no opportunity for growth. They are not making a strong case. I am wondering what lessons I should be taking away from this. I contemplate the increasing presence of Millennials in the workforce.
I missed the first ML lecture, due to the aforementioned schedule switch. I walk in fifteen minutes late to the second lecture, having gotten lost trying to find the room. I take a seat near the back, finding a spot next to a rebellious-looking kid straight out of hipster Brooklyn. I get out my notebook (an extra-large unlined Moleskine, my signature), and a pencil.
I look at the board and the first thing I see is some sort of upside-down triangle. The professor is talking about gradients and projections. I have no idea what is going on. For a few minutes I strongly consider walking out.
I stick it out. I basically understand what is going on. The professor is very good. I can visualize shapes in three-dimensional space. I know how to square things. I calm down. The kid next to me is funny and interesting.
I end up really liking Machine Learning. It is like being in wizard school. There are many interesting ways to turn data into different data that is somewhat more actionable. I use special symbols to make knowledge appear out of thin air. Hyperplanes. Prediction. Kernels. Boosting. I am very pleased with the whole thing.
I get another A. I am heartened. Thanks, Paisley.
I am starting to think that this math and computer thing is important and that I should start doing more of it. My company is reaching the same conclusion. We part ways in May, a calm and appropriate separation. I had managed to cover most of the year’s tuition as well as replenish most of my savings, so I was feeling comfortable financially. I easily had five months’ rent just lying around. I feel free.
I decide it’s time to learn math properly. Hustling through two semesters of grad school was fine, but I want to actually be good at this. I needed to know what things were, and how they worked. I wanted to learn the math that I could have learned in college, if I had been a bit more mature. I thought back to the Berkeley undergraduate math sequence, specifically the lower-division requirements. 1A: Differential Calculus. 1B: Integral Calculus. 53: Multivariable Calculus. 54: Linear Algebra and Differential Equations. 55: Discrete Math.
I decide that I want all of it. I had taken 1A as a freshman, driven by a certain exploratory impulse that was later attenuated by the pre-law GPA-protecting pragmatism. I took Math 55 as a senior, as one of the “hard classes” for my Cognitive Science major. Up until this year, I had had a fear of math as something which I would not easily be good at. I had been lazy about math in high school; I hadn’t yet seen the point.
I get to it. I plan to go to school full-time in September. It is late May. I have about three months. I need to build a foundation. I am not interested in paying for undergraduate classes. A formal course would also be far too slow. I’ve heard of the internet. I’m motivated.
I start at Khan Academy. Their automated assessment politely suggests I review some precalculus. I am disheartened but resolved. I had copied my friend Stephen’s math homework all throughout 11th grade. Khan Academy is correct.
Salman Khan walks me from the Unit Sphere all the way through Integral Calculus. Triangles, sinusoids, Taylor series, implicit differentiation, integration by parts, complex numbers, the works. I am embarrassed at first – I am learning material being taught to high schoolers ten years my junior. But this is the path. I take every test, solve every problem that comes my way. (I admit to skipping the test for L’Hospital’s rule.)
The whole thing takes about a month. I finish everything through single-variable calculus in late June. I decide to start multivariable. I begin looking at Khan Academy’s offerings, but find them underdeveloped. It is time for something more traditional. Enter MIT OpenCourseWare.
MIT foresaw the MOOC revolution at least ten years early, having started putting lectures and course materials online in 1999. Browsing the web for multivariable calculus courses, I find videos of the entire set of lectures for MIT 18.02, Multivariable Calculus, taught in Fall 2007 by Denis Auroux. This is right.
I am able to work through 2-3 lectures per day. A 50-minute lecture takes me about 2-3 hours to digest. I do not skip days, although on weekends I go a bit easy. I am very happy about multivariable calculus. Vectors make sense, as do their gradients. Level sets. Divergence. Flux. Curl. I get used to drawing in three dimensions. I do a lot of integrals. All of a sudden there are theorems. I draw a lot of shapes. I like the way the pencil taps against the notebook when I am solving equations quickly. I develop certain writing flourishes. Denis Auroux is a funny lecturer. I like theorems.
I continue to take all the tests. I take my time with them. I enjoy them. I do pretty well.
I finish in about three weeks. It’s time for Linear Algebra. I look online again and end up back at MIT OCW, this time at Gilbert Strang’s 18.06, recorded Fall 1999.
I love linear algebra. I love Gilbert Strang. I felt transported to that classroom in 1999, learning about vector spaces and orthogonality and rank. The relationships between the various mathematical objects are rich and profound. Machine Learning is making more sense. I am intrigued by the properties of determinants.
I continue to process 2-3 lectures per day. In early August, I fly back to California for five weeks: three in Santa Monica, two in Oakland, one in Black Rock City. Not a bad itinerary. My time in Santa Monica is spent riding my bike, hanging out with my parents, and finding eigenvectors. It is exceedingly pleasant. Gilbert Strang is an amazing lecturer. I multiply a lot of matrices. “Again,” I tell myself, starting on a practice problem. “Again. Again.” I develop intuitions.
By the time I get to Oakland, I have only three lectures and the final left to take. I find it markedly more difficult to concentrate once I arrive – the excitement, the reunions with friends, the adventure of being back in the Bay all prove distracting. I struggle to focus, but eventually do power through and finish. I spend six hours on the final, sitting in a coffee shop on Telegraph Avenue. Again, I do well.
I want to take a moment and acknowledge the debt owed to these three men, and the teams which organized and published their content. The education I received would not have been possible ten years ago; someone in my position would have faced significantly greater obstacles. I feel a closeness to them, as though I had truly been their student. Thanks.
I get back to Brooklyn just in time for the second day of school. My first class is Advanced Machine Learning. The first lecture: a “calibration quiz” meant to prune the class down to size. I am anxious. This class is important to me. I get in.
The next chapter begins.