Wednesday, March 18, 2015

Double Hashing (Lueker and Molodowitch)

A subject I've grown interested in, related to multiple-choice hashing schemes, is when (and why) double hashing can be used in place of "random hashing" with an asymptotically negligible difference in performance.

One early, useful work on this subject is by Lueker and Molodowitch.  They provide a very nice coupling argument between double hashing and random hashing in the setting of open address hashing in their paper More Analysis of Double Hashing.  (The original paper appeared in STOC 1988.)  In this post I'll summarize their argument.  I apologize that both the text and my exposition might be a little rough. 

They work in the open address hashing setting;  each key runs through a permutation of the table locations when it is being placed, and it placed in the first empty location, with each location holding a single key.  When searching for a key, we run sequentially through its permutation;  we either eventually find the element or we find an empty slot, in which case we know the key was not in the table, and the search was unsuccessful.  We measure the expected time for an unsuccessful search when a table with m slots is loaded with pm keys for a constant fraction p.  For convenience we will have m be prime, as this will simplify matters when we consider double hashing.  If each key's permutation is completely uniform over all permutations, we call this random hashing, and the expected time to search for key not in the table is 1/(1-p) + o(1);  with some work you can get that it is 1/(1-p)+O(1/m), but we will not concern ourselves so much with the low order terms here.

With double hashing, for a key x, the permutation is given by h_1(x)+ j h_2(x) mod m for hash functions h_1 and h_2, where h_1(x) is uniform over the range [0,m-1], h_2(x) is uniform over the range [1,m-1], and the permutation takes the values in the order j=0,1,2,...  This gives a permutation (because m is prime), and with double hashing, you just need two random hash values, which from a theoretical standpoint is "much less randomness" than a fully random permutation, and from a practical standpoint is easier to implement. 

What Lueker and Molodowitch show is that for any (constant) load factor p, with double hashing, the expected time for an unsuccessful search remains 1/(1-p) + o(1).  They show this through a coupling, which shows that double hashing and random hashing can be coupled so the "the same thing happens" -- that is, the key goes into the same slot -- under both double hashing and random hashing most of the time.  Unfortunately, it doesn't happen all the time;  the coupling is not strong enough to say that all the keys are placed the same with high probability.  But they show that they can arrange the coupling so that thing work out nicely just the same.

To start, let us start with a setting where we have loaded our tables with n keys using random hashing, and now take two copies of our state, and consider a single step of random hashing in one copy and double hashing in the other copy, side by side.  Clearly, for random hashing, the probability that a key is placed in any empty slot is 1/(m-n) for each slot.  In expectation (over the random past), by symmetry, for double hashing the expected probability that a key is placed in any empty is 1/(m-n), but the actual probability for each slot will depend on the configuration.  But what they show, using Chernoff bounds, is the the actual probability the key is placed in each slot is at most q/(m-n) for some q that is (1+o(1)), with high probability over the past random placements of the n keys.

Now for the coupling.  Starting from empty, at each step we use double hashing in both of our copies with probability 1/q = 1- o(1).  Note that this ensures that the probability a key is placed in the "random hashing" copy of the process is at most 1/(m-n), so far.  So with probability 1/q, we have placed the key in the same slot in both tables, and so it is as though we've done random hashing for this step.

But what about what happens with probability 1-1/q?  Maybe we could ignore it, if 1/q was 1-o(1/n) for example, as a low probability event;  unfortunately, that's not the case.  In particular, we actually expect that the coupling will fail for some smallish (polylogarithmic) number of steps.

Instead, with probability 1-1/q we place the key so that the step follows random hashing in total.  I'm not saying with probability 1-1/q we place the key at random;  I'm saying we place the key so that, in total (including the 1-1/q probability first step where they key was placed by double hashing) we place the key so that, overall, the probability any empty slot obtains the key is 1/(m-n).  Another way of thinking about this is in the other direction;  my coupling always placed the key according the random hashing, and with probability 1/q (which again is very close to 1) that matches what would be done with double hashing.

So in our random hashing copy of the table, we just placed a key according to random hashing.  How should we think of what is happening over in the double hashing copy?  For that table, with probability 1/q all went fine -- a key was placed by double hashing -- and with probability 1-1/q some key just dropped into the table that wasn't placed by double hashing.  It's like an extra present from above.  But it's not a key placed by double hashing.

The next part of the argument is to recognize that that's OK, in the following sense.  If you simply add a key anywhere is an open addressed hash table, you just make things worse, in a very specific way.  Any slot in the table that would have been filled if you hadn't put in that key will still be filled at the end of the process even when you add that key.   That is, if S is the set of slots that would contain a key if no extra keys get placed, and S' is the set of slots that contain a key if you, at various points in the process, just add some extra keys anywhere at any point, then a simple induction gives S is a subset S'.

So now let's consider multiple steps of this coupling.  At each step, the ball is actually placed according to random hashing, so at every point in the process, the "state" is that of a random hashing process.  On the double hashing side of the coupling, with probability 1/q a ball was placed by double hashing, and with probability 1-1/q an extra ball was just placed.  So if we count the number of balls placed by double hashing, when we reach the time when n keys have been placed by double hashing in this process, on average n/(1-1/q) = n(1+o(1)) keys (in expectation -- by Chernoff bounds one can get a high probability result) have been placed overall.

The result:  placing n keys by double hashing is stochastically dominated (in terms of the keys that have been placed) by placing n(1+o(1)) keys by random hashing.  In particular, after we place n=pm keys using double hashing, the expected time for an unsuccessful search is bounded above the expected time for unsuccessful search after putting in pm+o(pm) keys using random hashing, which is 1/(1-p) + o(1).  You can do a similar sort of coupling to show that double hashing stochastically dominates placing n(1-o(1)) keys by random hashing.  As a result, asymptotically, there's only an o(1) difference in terms of the expected time for unsuccessful search, a result which explains the negligible difference in performance one sees in implementation.





  


Monday, March 16, 2015

Power of Randomness at Georgia Tech

I'm spending (part of) the week at "The Power of Randomness in Computation Workshop", an IMA (Institute for Mathematics and its Applications) and ARC (Georgia Tech Algorithm and Randomness Center) co-sponsored workshop at Georgia Tech.  Here's the schedule.  I'm told slides will eventually be put up somewhere on the IMA website for such things.  Great organization at Georgia Tech -- a big crowd in a very nice room, lots of food and coffee, all very well organized.  They even had Ben Affleck waiting in front of the building for us this morning.  He seemed to be a little busy shooting a movie to greet us properly, but maybe he'll have a bit more time to chat tomorrow.  

Besides Ben, a few other highlights:

Leslie Goldberg started things of talking about the complexity of approximating complex-valued Ising and Tutte partition functions.  I remember the Ising/Tutte models (mostly from graduate school and shortly after);  now there are connections between various problems in quantum computing and these functions on complex values, which (or course) I had not known.

Nike Sun gave a talk on the exact k-SAT threshold (for large k).  It was very clearly presented and gave the argument at the intuitive level.  I gained some insight into why the "locally random tree" type argument I've enjoyed in coding/belief propagation arguments breaks down in certain satisfiability problems, due to clustering of solutions and other challenging correlations, and how those issues can be handled. I started to understand (I think) the point of replica symmetry breaking arguments and how they were used to guide the analysis of the k-SAT problem.

Other talks from the day:  Amin-Coja Oghlan also talked about replica symmetry techniques and their uses for random graph coloring problems, Eli Upfal talked about some new shuffling techniques for oblivious storage dubbed the Melbourne shuffle, Aravind Srinivasan gave a talk on the Lovasz Local Lemma (staring from the Moser-Tardos results and showing how these arguments carry forward and give greater power and insight into the use of the LLL for additional problems), and I talked about invertible Bloom lookup tables and briefly mentioned a few other unrelated things in progress.

  



Tuesday, March 03, 2015

Hate EasyChair

I just typed in a nice long review on EasyChair.  Yes, I prefer doing this with the online form when I'm just sitting around and have time to do a review. 

Apparently I didn't hit one of the score buttons (although I'm pretty sure I did, let's give EasyChair the benefit of the doubt there) so EasyChair says there's an error and, of course, forgets my nice long typed review when it takes me back to the review page, so I'll get to re-do and re-type it later. 

Sigh.  I guess I'll go back to doing my reviews in a text file and cutting and pasting.  No, this has not happened to me in recent memory in HotCRP.  Put this down as one more reason (but not the only one) why I don't like EasyChair and would prefer a better designed system (like HotCRP...).


Wednesday, February 11, 2015

New Heapable Subsequence Paper

In the "only a dozen people could care about this category"...

About 4 1/2 years ago, I posted about a paper we had put up on the arxiv about Heapable Sequences and Subsequences.  The basic combinatorial structure we were looking at is a seemingly natural generalization of the idea of Longest Increasing Subsequences.  Say that a sequence is heapable if you can sequentially place the items into a (binary, increasing) heap, so each new item is the child of some item already in the heap.  So, for example, 1 4 2 3 5 is heapable, but 1 5 3 4 2 is not.  Once you have this idea, you can ask about things like the Longest Heapable Subsequence of a sequence (algorithms for it, expected length with a random permutation, etc.).  Our paper had some results and lots of open questions.

I admit, when we did this paper I was hoping that some combinatorialist(s) would find the notion compelling, take up the questions, and find some cool connections.  Longest Increasing Subsequences are somehow related to Young tableaux, interacting particle systems, and all sorts of other cool things.  So what about Longest Heapable Subsequences?

I had to wait a few years, but Gabriel Istrate and Cosmin Bonchis recently put a paper up on the arxiv that makes these connections.  Here's the abstract:
We investigate partitioning of integer sequences into heapable subsequences (previously defined and established by Mitzenmacher et al). We show that an extension of patience sorting computes the decomposition into a minimal number of heapable subsequences (MHS). We connect this parameter to an interactive particle system, a multiset extension of Hammersley's process, and investigate its expected value on a random permutation. In contrast with the (well studied) case of the longest increasing subsequence, we bring experimental evidence that the correct asymptotic scaling is 1+52ln(n). Finally we give a heap-based extension of Young tableaux, prove a hook inequality and an extension of the Robinson-Schensted correspondence.
(Note, that should really be "Byers et al...")  

I love the new conjecture that the expected minimal number of heapable subsequences a random sequence decomposes into is ((1+sqrt{5})/2) ln n.  (It's clearly at least ln n, the expected number of minima in the sequence.) 

There are still all sort of open questions, that seem surprisingly difficult;  and I certainly can't claim I know of any important practical applications.  But Longest Heapable Subsequences just appeal to me as a simple, straightforward mathematical object that I wish I understood more.

For simple-sounding but apparently difficult open questions,  as far as I know, the answer to even the basic question of "What is the formula for how many sequences of length n are heapable?" is still not known.  Similarly, I think the question of finding an efficient algorithm for determining the Longest Heapable Subsequence (or showing it is hard for some class) is open as well.  


Monday, December 15, 2014

Stress: Competition and Ranking

One issue I keep seeing in comments here and elsewhere on this issue is that academia is very competitive, with everyone worried about their rank.  In my last post, I admitted that it would be hard to completely deny that there is competition, particularly when people are younger, which tends to come out when jobs are at stake.  But, for the most part, I think the role of competition is completely exaggerated, strangely so.  Academia -- at least, certainly, my branch of it -- thrives on collaboration, and I believe people who go into it thinking that it is a big competition are going to lose out, both professionally and in their enjoyment of the profession.  (Indeed, one of the reasons I write these posts is because I don't want undergraduate and graduate students getting what I think is a one-sided, incorrect view of how academics work.)

First, I would again like to compare academics with other professions.  I've seen comments (including on this blog) that other professions are less competitive than academia, and people are less worried about rank.  I think people who are making that suggestion need to check in with people working in those professions, because I think they're ridiculously wrong.  Lawyers coming out of school to get jobs, doctors trying to get fellowships or residencies, consultants at consulting firms -- all very competitive.  As you move up, this continues.  For lawyers, there's who can bill the most hours, and the challenge to make partner;  for doctors, who can get the best positions at hospitals;  and for businesspeople, every promotion is a step up.  Even for software engineering types, there's competition.  When I was a graduate student, I recall visiting friends who had gone to a large well-known company, and for a large part of the evening all they talked about was what "level" they and others were in the company and the annual reviews and who was doing what project that might get them ahead.  So let's not be delusional and start by acknowledging that there's competition everywhere, and that's unsurprising when jobs and money are at stake.  While I'm not suggesting I have deep knowledge of all of these careers, I think academics have much less competition than most.  

If academics appear like they're concerned about ranking, perhaps it's because they appear to be easy to rank.  First, as I pointed out last post, there's not that many of us.  Second, there are obvious metrics everyone can understand:  number of papers published, number of papers appearing in "top" conferences, and h-index stand out.  I'm not suggesting these are good metrics -- but they're easy and to a first order give (potentially) some useful information.  They're a quick way of bucketing or sorting people, particularly those without an established track record and are therefore not necessarily widely known or visible in the field, and therefore have more of an impression and an impact on younger academics.

But very quickly after your PhD, this sort of ranking loses its importance, and the very idea of ranking starts to lose its value -- as many have noted in a variety of venues.  In academia, there's lots of ways to be successful, many points on the Pareto frontier.  There are many great results waiting to be found and many different subfields to work in.  At the end of the day, a history of good work and specific achievements is what people look for;  there's not really a finite pool of such things for which to compete.  Indeed, I'm not sure how I would go about competing with the top people in the field, except to try to do interesting work, which is what I'm trying to do anyway.  (A secondary way to compete is just to make sure people know about your work.  But giving talks is less important to being successful than doing the work that goes into the talks in the first place;  again, it can have a bigger impact for people in the early stages of their career.) 

Against this idea of competition, just look at how people in academia work together.  In computer science theory, in particular, most papers have several authors working together.  In a number of cases these are students with their advisors, but a closer look reveals that in many cases, they are not.  Credit can be shared easily in academia, and collaborations can lead to results that individuals could not get alone.  Working in groups is a way for people to get more done.  Instead of competition, collaboration often yields the path to having a greater impact on the field.  Rather than being a "competitive game", research is more a "cooperative game".  (As an aside, this is why theory's approach of alphabetical order for authors rather than some sort of implicit "credit scheme" based on author order makes such great sense.)  In most areas of computer science that I've participated in, a similar spirit prevails.

I encourage graduate students to team up pick out projects to work on together (and have seen this at other places, also -- one of my best experiences as a graduate student was such a project).  It gives them something to do and a sense of ownership outside of working with their advisor.  And, importantly, in reinforces that these other students are their colleagues, and that working together is a great idea that can gain for everyone.  Hopefully, they also learn that working together is more fun and generally more productive than working alone.  When it comes to hiring time, it's nice to see students who have worked on such team projects, because I typically prefer colleagues with a track record of working well with others.     

Sometimes small competitions break out, sure -- multiple groups are working on the same or similar problems.  Often, though, this is a very healthy competition, pushing progress forward in an area over a series of papers.  I remember in the past an argument with another group when we were working on similar problems and an issue of "credit" in the writeup of the various papers came up.  A week later, we were starting collaborations together on new ideas.  That's not exactly the sign of a super-competitive landscape. 

It could be that I've just got a mistaken impression of the field.  Harvard is a wonderfully collaborative place, and personally I've found overall I like working with others more than on my own.  But when I think of cutthroat competition, I don't think of the job I'm in.

To conclude the post, I think what may be going on is people confuse "competition" with "drive".  Most academics are smart, successful people, who are internally driven by a desire to do great work.  To many, that must appear like "competition", but if so, it's internal competition -- you're not out to beat others, but to be our best.  And I think it's very possible academia does have more "Type A" personalities that have this internal drive that is, surely, not always healthy.  It's not clear to me that this academia's fault -- such people would be similarly driven in any career -- but, if it is true, then it suggests we might consider if this is best for our field, and how we might open up the field to a wider set of personalities or how we might make work in this field healthier for this type of personality.  





Wednesday, December 10, 2014

Stress, Continued: Jobs

Continuing from last post, I aim to focus on some of the issues related to stress in academia.  Today's post will be related to stress and the nature of academic employment.

The most stressful issues I can think of in academia relate to finding or keeping your job.  Graduating and finding the first job, especially when the job market is tight, cannot help but lead to stress for most people.  (Indeed, these days, the process seems to be getting worse -- as postdocs become normalized in Computer Science, many people have to find the their "first job" two or more times.)  Similarly, when you come up for tenure, it's obviously very stressful.  Even if you think you deserve tenure, there is uncertainty because the process it outside your control, and the 1-bit decision can have a tremendous impact.  While changing jobs later can also be stressful, and I'll discuss that, these two times seem to be the worst.

As a starting point, however, compared to other professions, the issues related to getting a first job and tenure are not especially unique to academia.  Doctors have residency and specialization after medical school, lawyers wait years to become partner.  Business people have it different, perhaps;  tech startups notwithstanding, there are people who still climb up the corporate ladder.  The tenure situation is, arguably, a more severe end goal than in other professions, but with what seems to be a commensurate reward;  you don't really have to worry about a job after, if you are willing to stay at your institution.  The framework and corresponding stress seem comparable to many other career paths, although there are career paths that avoid such poignant milestones.  In particular, in computer science, many students from top institutions can quickly and readily find work at major companies or startups, and their stress seems to come from having a wealth rather than dearth of choices.

Would there be changes to the system that would help?  I don't think so, and this requires some explanation.  The issue of job stress, to me, seems fundamentally an issue of supply and demand.  In the top 20 computer science departments there are approximately 1000 professor jobs.  Heck, maybe I'm off by a factor of 2, depending on how you count (tenured vs all, maybe top 30 instead of 20).  Positions turn over very slowly.  In short, academia is a niche market, with a small supply of jobs and, generally, heavy demand for them.  This creates a lot of friction in the system.

In years past, we've seen a slowdown in the CS academic job market.  Even small slowdowns create tough situations with the small job supply.  The solution was to introduce more postdocs.  It was a working solution for the time, but with risks and downsides -- a de facto postdoc requirement added into the employment picture? -- that leaves us with as many questions as answers.

Similarly, tenure seems like such a huge deal in large because one cannot readily move to another position easily.  Yes, there is the issue of being rejected, and the corresponding loss of prestige that goes with it, but even ignoring that, the challenge of where to go (in academia) if tenure is not granted looms large.  The problem extends past tenure.  Even very good people can find it hard to move, as the small number of jobs available make for an inflexible, challenging job market.  If I was at Google and wanted to work at Facebook or some other company, such a move should not, generally, be difficult;  people make such switches pretty regularly.  If I walked over to MIT and told them I would like to move there, there are huge barriers, perhaps the most obvious of which is MIT has a wealthy choice of wonderful people it could choose to hire instead, and is careful in choosing who it hires to one of those limited slots.  Indeed, at most other schools, the obvious issue is whether there would even be a senior position available, but MIT computer science always seems to have a position if it wants one.*  The upshot is if later in life a professor wants to switch jobs for any number of personal reasons (dissatisfaction with the current location, divorce or family issues, etc.), it's not always easy or possible to do and stay in academia.   The problem is again related to scarcity in jobs, and solving that problem seems out of reach, involving changes to institutional or societal structures.  (That is, we have to convince our universities we need to hire more CS faculty, and/or we need to convince society to spend more on research/professors/education.)

The job stress issue is the most prominent stress issue I see in academia, and I think it underlies a lot of other stresses.  When people say academics are over-concerned with and spend too much time jockeying for being ranked highly -- a point I have issues with and I'll return to in a later post -- to the extent that's true, I think some of it is inescapable in such a job market.  When you graduate, you'll be compared to other graduates and that will have an effect on what jobs you can get.  When you are up for tenure, you'll be compared to peers, implicitly or explicitly.  If you want to switch jobs post-tenure, how you compare to people at the institution you wish to move to and to others in your field is important.  All of these comparison points become more important in a tight, friction-filled job market.  As much as I'd like that not to be the case or deny it, I think it's better to face the reality of it.

What possibility is there for a solution?  The easiest I can think of is to expand the job market, which I think comes from industry jobs.  As part of this, we have to help make sure industry sees the value and importance of research, and of having its own researchers.  Some people have said that academics look down on "non-academic" employment for students and especially PhDs.  I don't think that's generally true in CS, but to the extent it is, that's setting unrealistic expectations for outcomes for many -- or most -- of our students.  The virtues of jobs in industry are well documented, as are the virtues of academic jobs;  maintaining a culture where both are seen as positive outcomes seems healthiest for stress levels of individuals and for the health of the CS academic community generally.  

Other solutions welcome and more to come.

* Just kidding around with you, MIT.  But seriously, is there any upper bound on you all?

 
    


Tuesday, December 09, 2014

Stress

I've seen lots of news and posts recently on the general theme of stress in academia. Daniel Lemire blogs about Stefan Grimm, a professor who recently committed suicide, and talks about academia as an anxiety machine.  More close to home, we recall that Seth Teller committed suicide earlier this year.  Previous Harvard faculty member and loud blogger Matt Welsh frequently portrays academia as a stressful place (most recently discussing the Fame Trap), and recently Harvard's Radhika Nagpal offered her advice about the scary myths of academic life and how to find happiness.

So here is my question, and I don't think it's a simple one:  is the academic world a stressful place?  The above are an interesting set of anecdotes, but there are certainly contradictory anecdotes. 

For example, as a starting point we might look to the CareerCast list of least stressful jobs, where university professor was judged the least stressful job in 2013 (and nearly the least in 2014), and the reaction to it.  (A nice Forbes article here, and last year's related article here with hundreds of comments.  Inside Higher Ed articles here (2014) and here (2013).)  The big quote this year from Inside Higher Ed:  
Tony Lee, publisher, CareerCast, added via email: "We received a lot of feedback about our ranking of university professor as a low-stress job. But we found that while adjunct and part-time teachers are right that their jobs can be stressful, the stress levels for tenured university professors – which is what we rank – are lower than the majority of other jobs we measure in our report."
My own take on this amusement (before we get to the serious) is that they probably underestimate the stress levels of professors -- even tenured, university profs.  On the other hand, since CareerCast's "most stressful" jobs include military, police, firefighters, and event coordinators, I'm not going to argue against the point that university professors may have it comparatively easy.

Now, more seriously, is life working in academia stressful?  I think it's an important question;  to the extent that academia is more stressful than it should be, it may be damaging both to individuals, and to the overall success of scientific productivity on a larger scale.  On the other hand, perhaps the level of stress in academia is natural for various reasons, including the type of work involved, or the type of people involved, and is not really so bad either in relative or absolute terms.   

I don't know the answer to this question.  I could speak for myself (this is my blog, after all), but that's just more anecdote.  I poked around a little bit but didn't see anything especially informative in the research literature I saw;  I'd be happy to have pointers suggested.  Also, it's a problematic question, for several reasons, as there are all sorts of underlying questions to answer first.

1)  Stressful compared to what?  What are the right comparisons?  I tend to think the comparisons should be with other professional jobs (doctor, lawyer, businessperson -- and, for CS, I guess the appropriate "software engineer" type title), which seem to me to pretty stressful employment options in their own right. 

2)  Stressful at what time in one's career?  I think there are different stressors, and different amounts of stress, at different career points.  Grad student stress is not the same as assistant faculty stress is not the same as tenured faculty stress.

3)  How would one measure "stress", anyway?  CareerCast gives some information on its criteria which leads to university professor being (one of) the least stressful jobs.  Daniel Lemire offers the week-end-freedom test for how free you are (which may be one way of checking stress), but I don't agree with his assumptions. 

4)  And how might one account for various selection effects?  Maybe academics are more generally "Type A personalities".  Perhaps the right questions are not why academics are stressed, but why does academia attract so many Type A personalities, and/or does academia reward Type A personality-types in ways that are detrimental.

5)  Are Computer Science/Engineering in academia different in terms of stress than other academic fields, since these are the areas I care most about?  (Ostensibly, for instance, we get paid higher salaries, and have ample opportunities for outside-university work.  Both of these could affect stress levels.)

I'm sure you can come up with more issues that complicate the basic question.

I may do some more posts on this basic theme;  I do think the topic is important, but there's little hard information.  Seeing as how there's lots of psychologists hanging around universities, I'm surprised that I didn't quickly find one or more recent comparative studies on stress levels, personality types, etc. for academics, but perhaps I just need to look harder.  (If only I had the time!) 

To be clear, my bias is that when I read blog posts from my respected colleagues that portray academic life as some sort of competitive pressure-cooker where everyone is out to increase their ranking and are working 80+ hour weeks without any sort of social life, that's just not the reality I am aware of.  (Including, I think, for the people who actually write these posts, who generally seem to be talking about "other people".) 

To cut off some potential responses, I'm not blind, nor stupid;  there are plenty of stresses in academia, in particular having to do with the employment situation, where the number of jobs is not sufficient for the number of potential job-seekers generated.  And I would be interested in finding useful ways to understanding stress in the field and how to reduce it productively, especially for graduate students.  But to do this, we would need to move beyond individual anecdotes, gain some more data-driven understanding of what's going on, and start determining best practices.  The mythologies about academic life and stress do not seem helpful in understanding where there are problems and how to address them.  





Monday, December 08, 2014

CS 50 Fair (Live)

The madness that is the CS50 Fair at Harvard is going on now.  Hundreds of students demo-ing their final projects.  If you can't be there, you can watch the live stream.  I did a brief stop-by this morning and saw some fun and useful stuff -- one student showed me a simple setup for students to ask questions online in class and have other students up-vote them so the prof can see popular questions.  (There are tools that do this, sure, but it would be nice to have a simple one-off easy-to-use code setup to do this on the fly.  Hey -- student -- if you're reading this, come see me, I want your code. )  Another had a nice interface that gathered information on all the various Harvard events from different websites and pulled them together with some basic search functionality.  I'll have to talk with David Malan about how we're doing transitioning some of these useful prototypes into tools for us or around campus.  Still amazing what energetic CS students can do after an introductory class.  And hopefully I'll go back and see another shift of projects later this afternoon.




Tuesday, December 02, 2014

Yet More Good News, Harry Lewis Edition

As mentioned previously, our current Dean for the School of Engineering and Applied Sciences, Cherry Murray, will be stepping down at the end of the year.

The good news out today is that Professor Harry Lewis will be stepping up to serve as the Interim Dean.  To those in the outside world, you may know Harry for his blog, his books (CS books like Elements of the Theory of Computation (2nd Edition) and Data Structures and Their Algorithms, or more popular books like Excellence Without a Soul: Does Liberal Education Have a Future?, Blown to Bits: Your Life, Liberty, and Happiness After the Digital Explosion, and Baseball As A Second Language), or his time as the Dean of Harvard College. 

For those of us who have been students (or professors, or both) at Harvard, we know Harry as a mentor, a guide, a wise man to come to for advice, and an inspiration.  (Both Salil and I frequently mention Harry Lewis and CS 121 at Harvard as reasons why we ended up as computer scientists.)

We are very fortunate that Harry stepped up to take on this important role, and his work will let SEAS -- and CS at Harvard -- continue to flourish.  As Harry says in the Crimson article above, "“The future of engineering at Harvard has never looked brighter than it does today."  And as FAS Dean Michael Smith says, “He brings to the deanship a thorough understanding of SEAS culture and academic life, exceptional insight into the undergraduate experience as a former dean of Harvard College, and the intellectual and administrative agility needed to effectively guide SEAS during this interim period.”