Course Home | Course Policies | Assignments | Schedule & Lecture Notes

CS 5930: Advanced Data Structures
Spring 2020

HOMEWORK ASSIGNMENTS

I expect to assign about 3-4 homework assignments during this course.

Please make sure you understand the policy towards Academic Integrity.

Homework assignments are due to the instructor by the start of class on the date specified, unless other arrangements have been made with the instructor.

The table below gives the assignments, and associated dates. All future dates/topics are tentative until such assignments are made available.

Assignment Topic Due Date How to submit

Homework 1 Union-Find, Skip lists, and trees Feb. 14 Due by the start of class, either via email or in person submission

Homework 2 ALL the trees March 6 Due by the start of class, either via email or in person submission

Assignment	Topic	Due Date	How to submit
Homework 1	Union-Find, Skip lists, and trees	Feb. 14	Due by the start of class, either via email or in person submission
Homework 2	ALL the trees	March 6	Due by the start of class, either via email or in person submission

Final project proposals

(Note that these have been shared with permission of the individual students. If you prefer not to share, let me know and I'll remove!)

Eric Adams and Skyler Howe: Educational van Emde Boas Tree Implementation
Sadiya Ahmad: Modified Van Emde Boas trees
Cletus Ajibade: Graph coloring data structures
Rahul Chauhan: Visualization for Splay Trees and B Trees
Aman Khalid: Analyzing performance of Cassandra with Cuckoo Filter: Later work on this project resulted in a paper published in the IEEE Conference on Network Function Virtualization and Software Defined Networks (NFV-SDN).
Qingjie Lu: Visualization of Fibonacci Heaps
Sagar Rajashekar: Cache Oblivious queries for Efficient IP Routing
Chao Ren: Splay trees and clustering

Final project

Grades for this class will be based primarily on a final project. Projects may be submitted by teams of up to three students, or may be done individually. However, if possible, students are strongly encouraged to collaborate outside their teams, with anyone in or out of class (with proper credit, of course). Each team will submit a written document (expect about 10-15 pages, although that's flexible) and give a short presentation or poster display in the final week of class. Projects can take one of many forms:

Theoretical work: Attempt to solve an interesting, non-trivial, and preferably open theoretical problem related to data structures.
Experimental: Implement and experimentally evaluate several data structures for some problem. Projects of this type can include new data structures that are hard to analyze, or that are practically efficient (in some interesting context) despite theoretical inefficiency.
Scholarly: Write a comprehensive survey of a topic related to data structures. Surveys should be about 15-20 pages long, and should include a history of the topic, a description of motivating applications, a summary of known results, sketches of the most important data structures and proofs, suggestions for further research, and a thorough bibliography. Pictures are also highly encouraged to illustrate the workings!
Creative: Make a youtube video, or interactive demo of an advanced data structures topic, preferably one that doesn't have a lot of information available on the web. Or you can substantially improve the Wikipedia articles for an advanced data structures topics. Or come see me if you have other creative ideas! The idea is to show significant knowledge on some advanced data structures topic, but I'm prepared to be flexible for this category.

Students are strongly encouraged to work on projects motivated by their primary research/development interests. In particular, project topics need not be limited to the specific topics covered in class, as long as they focus on data structures. Especially for theoretical projects, you should work on problems whose solution you want to know but don't.

One month before the end of the semester, each student will submit a project proposal. Project proposals are due Friday, April 2; I'll give feedback on each the following week. (Note that this is a hard deadline - no extentions!) Proposals should be 2-3 pages in length, and should include a brief and self-contained statement of the proposed topic or work, a brief survey of known results (which I'll double check), a potential plan of attack, and one or two half-baked ideas that probably won't work but what the heck you might get lucky. After everything is submitted, I will post all submitted proposals to this web page as inspiration for final projects, and we'll spend time in class discussing groups and which projects to pursue.

The ideal result of the project is something that can be polished into a publishable paper. This ideal is meant to be an attractive goal, not an absolute requirement - most research is not immediately successful, after all! If you do not find a complete solution, your writeup and presentation should describe partial results (for example, incremental improvements for some interesting special cases), promising approaches for ongoing work, what you did and where you got stuck in implementation, remaining questions where you're still stumped, and most importantly, ideas that initially looked promising but weren't (and why). Of course, this will vary slightly depending on what type of project you have, so plan to come discuss this with me after proposals are submitted. As a standing rule, I will give any student who publishes in a peer-reviewed conference or journal as a direct result of work in this class an A+. I can and will apply this rule retroactively even after the semester ends (at least until the university won't let me anymore - eventually these freeze).

Hey, wait! How do we find good problems?

Excellent question! Here are a few hopefully useful suggestions, and I'm happy to discuss with you further as needed:

Go read some papers. Many research papers end with a laundry list of open problems. Even if there's no explicit list, very few papers describe results that cannot be improved in some way: efficiency, generality, elegance, or all of the above. Look especially for the questions that nobody is asking! Reading papers is not only a good source of problems, but also the best way to understand what people already know, what people care about, and what people expect in a research publication. For this class, I'd suggest starting with papers discussed in class, data structures you've used and been interested to learn more about, or feel free to look at other advanced data structures classes.
Tweak an existing result. If the best algorithm is randomized, can you get the same performance deterministically? If the best algorithm is deterministic, is there a simpler randomized solution? Can you efficiently maintain the solution as the input changes? Can that constant 5 be changed to 10000, or to 1+ε? What if you have to pay for individual bit operations? What if you can perform rotations for free? What if you only care about cache misses? Can you solve the same problem for sets of strings as efficiently as for sets of numbers? What if the points are moving? How good an approximation can you compute in linear time? In sublinear time? What if you're only allowed to scan through the data once? What if you can only use constant space or constant time? What if you use ellipses instead of circles? Does it work in higher dimensions? Can you run the algorithm backward in time? In parallel? What if the computer can lie to you occasionally? Can you turn the problem into a game? What if the nodes have colors, the edges have negative weights, the disks are magnetized, the strawberries taste like strawberries, and the snozzberries taste like snozzberries? Don't worry (at first) about whether your questions make sense, or if they're impossible, or if they're trivial. The goal is just to come up with questions.
Talk to your supervisor and/or advisor. Are there data structures that would be useful to understand in more detail, or that they don't have a good implementation of in their chosen language? Are there holes in existing libraries, where a good implementation would be useful? Have they used something and been surprised at how slow or fast the code actually was, and can you test or implement the data structure to find out why?
Wander around at random. Start with an interesting paper. Look up other papers cited in its bibliography. Find the paper on CiteSeer or Google Scholar and look at newer papers that cite those. Go to the authors' web pages and look at their other papers. Google for an interesting phrase or an unfamiliar piece of terminology. Look at other papers in the same journal or conference proceedings, or on the same shelf at the library.
Go look at youtube videos for advanced data structures. First, does your favorite data structure exist? Is it described well? Can you adapt their approach to teach something new, or come up with a better way to explain? Is there an interactive demo?
Go check out wikipedia. Are there entries for all the data structures we cover? Do they actually help you understand the data structure? Can you come up with a way to improve a few of these? (Note that wikipedia tends to be pretty good, in large part because CS students and faculty take the time to improve it, but that doesn't mean it can't be improved!)

CS 5930: Advanced Data Structures Spring 2020

HOMEWORK ASSIGNMENTS

Final project proposals

Final project

Hey, wait! How do we find good problems?

CS 5930: Advanced Data Structures
Spring 2020