Please make sure you understand the policy towards Academic Integrity.
Homework assignments are due to the instructor by the start of class on the date specified, unless other arrangements have been made with the instructor.
The table below gives the assignments, and associated dates. All
future dates/topics are tentative until such assignments are made available.
Assignment | Topic | Due Date | How to submit |
---|---|---|---|
Homework 1 | Union-Find, Skip lists, and trees | Feb. 14 | Due by the start of class, either via email or in person submission |
Homework 2 | ALL the trees | March 6 | Due by the start of class, either via email or in person submission |
Students are strongly encouraged to work on projects motivated by their primary research/development interests. In particular, project topics need not be limited to the specific topics covered in class, as long as they focus on data structures. Especially for theoretical projects, you should work on problems whose solution you want to know but don't.
One month before the end of the semester, each student will submit a project proposal. Project proposals are due Friday, April 2; I'll give feedback on each the following week. (Note that this is a hard deadline - no extentions!) Proposals should be 2-3 pages in length, and should include a brief and self-contained statement of the proposed topic or work, a brief survey of known results (which I'll double check), a potential plan of attack, and one or two half-baked ideas that probably won't work but what the heck you might get lucky. After everything is submitted, I will post all submitted proposals to this web page as inspiration for final projects, and we'll spend time in class discussing groups and which projects to pursue.
The ideal result of the project is something that can be polished into a publishable paper. This ideal is meant to be an attractive goal, not an absolute requirement - most research is not immediately successful, after all! If you do not find a complete solution, your writeup and presentation should describe partial results (for example, incremental improvements for some interesting special cases), promising approaches for ongoing work, what you did and where you got stuck in implementation, remaining questions where you're still stumped, and most importantly, ideas that initially looked promising but weren't (and why). Of course, this will vary slightly depending on what type of project you have, so plan to come discuss this with me after proposals are submitted. As a standing rule, I will give any student who publishes in a peer-reviewed conference or journal as a direct result of work in this class an A+. I can and will apply this rule retroactively even after the semester ends (at least until the university won't let me anymore - eventually these freeze).
Go read some papers. Many research papers end with a laundry list of open problems. Even if there's no explicit list, very few papers describe results that cannot be improved in some way: efficiency, generality, elegance, or all of the above. Look especially for the questions that nobody is asking! Reading papers is not only a good source of problems, but also the best way to understand what people already know, what people care about, and what people expect in a research publication. For this class, I'd suggest starting with papers discussed in class, data structures you've used and been interested to learn more about, or feel free to look at other advanced data structures classes.
Tweak an existing result. If the best algorithm is randomized, can you get the same performance deterministically? If the best algorithm is deterministic, is there a simpler randomized solution? Can you efficiently maintain the solution as the input changes? Can that constant 5 be changed to 10000, or to 1+ε? What if you have to pay for individual bit operations? What if you can perform rotations for free? What if you only care about cache misses? Can you solve the same problem for sets of strings as efficiently as for sets of numbers? What if the points are moving? How good an approximation can you compute in linear time? In sublinear time? What if you're only allowed to scan through the data once? What if you can only use constant space or constant time? What if you use ellipses instead of circles? Does it work in higher dimensions? Can you run the algorithm backward in time? In parallel? What if the computer can lie to you occasionally? Can you turn the problem into a game? What if the nodes have colors, the edges have negative weights, the disks are magnetized, the strawberries taste like strawberries, and the snozzberries taste like snozzberries? Don't worry (at first) about whether your questions make sense, or if they're impossible, or if they're trivial. The goal is just to come up with questions.
Talk to your supervisor and/or advisor. Are there data structures that would be useful to understand in more detail, or that they don't have a good implementation of in their chosen language? Are there holes in existing libraries, where a good implementation would be useful? Have they used something and been surprised at how slow or fast the code actually was, and can you test or implement the data structure to find out why?
Wander around at random. Start with an interesting paper. Look up other papers cited in its bibliography. Find the paper on CiteSeer or Google Scholar and look at newer papers that cite those. Go to the authors' web pages and look at their other papers. Google for an interesting phrase or an unfamiliar piece of terminology. Look at other papers in the same journal or conference proceedings, or on the same shelf at the library.
Go look at youtube videos for advanced data structures. First, does your favorite data structure exist? Is it described well? Can you adapt their approach to teach something new, or come up with a better way to explain? Is there an interactive demo?
Go check out wikipedia. Are there entries for all the data structures we cover? Do they actually help you understand the data structure? Can you come up with a way to improve a few of these? (Note that wikipedia tends to be pretty good, in large part because CS students and faculty take the time to improve it, but that doesn't mean it can't be improved!)