I’m writing this as I wait to be assigned to an RCC compute node for my thesis work. It took me the better part of a day, but I think I know how to solve my large-scale entropy calculation conundrum. And yes, the answer is sparse matrices. Yet again.
But this post isn’t about extolling the virtues of sparse matrices. The rough idea got me thinking back on how much of my efficiency work over the last year boiled down to the use of smart data structures. With my dataset, these gains were critical. This got me thinking about my formal CS education, or the lack of it.
My intro to CS was the standard first-year Intro to Programming at KGP. I don’t remember much of it, other than the fact that it was taught in C (which I’ve completely forgotten), and that students with some past experience with programming (at the CBSE or ICSE level) did comfortably better than me. And that’s a shame, because I sure could use those fundamental concepts right now.
I had no idea when I’d use a linked list or a hash table in my first year, and I never paid any attention to them. I only picked up programming again much later, first with statistical computing. As I’ve drifted to data science over the last few years, I’ve had to re-learn and re-discover these concepts the hard way: estimating computational complexity, the bane of easy-to-implement but inefficient data structures, the allure of parallelization, and crafting optimal queries.
This is when I realized that it’s only with experience that you truly appreciate the importance and beauty of fundamentals. Sure, in an abstract sense, the logical cleanliness of these structures should have appealed to me. But that didn’t work for me, and I had to learn this the hard way. Which makes me wonder: How many iterations of styles did Tim Duncan go through before he decided to live and die by the fundamentals? Or was he convinced by the simplicity and efficacy of those concepts without ever having tried showboating?
Maybe I should finally take an algorithms and data structures course, right?
Anyway, my compute node beckons me now.