Please make sure you understand the academic integrity policy.
Homework assignments are due to the instructor by the start of class on the date specified, unless other arrangements have been made with the instructor. Some are stand-alone, while others will be a part of the final project - see description below.
The table below gives the assignments, and associated dates. All future dates/topics are tentative until such assignments are made available.
Assignment | Topic | Due date | Submission instructions | Solutions |
---|
Students are strongly encouraged to work on projects motivated by their primary research/development interests. In particular, project topics need not be limited to the specific topics covered in class, as long as they focus on data structures. Especially for theoretical projects, you should work on problems whose solution you want to know but don't.
One month before the end of the semester, each student will submit a project proposal. Project proposals are due on October 27; I'll give feedback on each the following week. (Note that this is a hard deadline - no extentions!) Proposals should be 2-3 pages in length, and should include a brief and self-contained statement of the proposed topic or work, a brief survey of known results (which I'll double check), a potential plan of attack, and one or two half-baked ideas that probably won't work but what the heck you might get lucky. After everything is submitted, I will post all submitted proposals to this web page as inspiration for final projects, and we'll spend time in class discussing groups and which projects to pursue.
The ideal result of the project is something that can be polished into a publishable paper. This ideal is meant to be an attractive goal, not an absolute requirement - most research is not immediately successful, after all! If you do not find a complete solution, your writeup and presentation should describe partial results (for example, incremental improvements for some interesting special cases), promising approaches for ongoing work, what you did and where you got stuck in implementation, remaining questions where you're still stumped, and most importantly, ideas that initially looked promising but weren't (and why). Of course, this will vary slightly depending on what type of project you have, so plan to come discuss this with me after proposals are submitted. As a standing rule, I will give any student who publishes in a peer-reviewed conference or journal as a direct result of work in this class an A+. I can and will apply this rule retroactively even after the semester ends (at least until the university won't let me anymore - eventually these freeze).
Talk to people! This is probably the single most important suggestion I can make. Other people have ideas, intuition, and experience that you don't; just as importantly, you have ideas, intuition, and experience that other people don't! Tell other people your half-baked ideas and listen to theirs. See if they know how to solve your random questions, and try to answer theirs. Ask for suggestions for papers to read, or techniques to try, or problems to think about, and offer your own in exchange. If you're really lucky, you'll find that you have enough background in common to work together. Great!
Talk to your supervisor and/or advisor, and talk to me! Are there places in your research that TDA would be useful? What libraries or tools could you investigate? What type of implementations or tools don't exist?
Go read some papers. Many research papers end with a laundry list of open problems. Even if there's no explicit list, very few papers describe results that cannot be improved in some way: efficiency, generality, elegance, or all of the above. Look especially for the questions that nobody is asking! Reading papers is not only a good source of problems, but also the best way to understand what people already know, what people care about, and what people expect in a research publication. For this class, I'd suggest starting with papers discussed in class, data structures you've used and been interested to learn more about, or feel free to look at other advanced data structures classes.
Tweak an existing result. If the best algorithm is randomized, can you get the same performance deterministically? If the best algorithm is deterministic, is there a simpler randomized solution? Can you efficiently maintain the solution as the input changes? Can that constant 5 be changed to 10000, or to 1+ε? What if you have to pay for individual bit operations? What if you can perform rotations for free? What if you only care about cache misses? Can you solve the same problem for sets of strings as efficiently as for sets of numbers? What if the points are moving? How good an approximation can you compute in linear time? In sublinear time? What if you're only allowed to scan through the data once? What if you can only use constant space or constant time? What if you use ellipses instead of circles? Does it work in higher dimensions? Can you run the algorithm backward in time? In parallel? What if the computer can lie to you occasionally? Can you turn the problem into a game? What if the nodes have colors, the edges have negative weights, the disks are magnetized, the strawberries taste like strawberries, and the snozzberries taste like snozzberries? Don't worry (at first) about whether your questions make sense, or if they're impossible, or if they're trivial. The goal is just to come up with questions.
Go look at youtube videos on TDA - the AARTN online talks are a great place to start. Is there a problem or application you find interesting that they discuss there? Go find the paper, and then read my next suggestion!
Wander around at random. Start with an interesting paper. Look up other papers cited in its bibliography. Find the paper on CiteSeer or Google Scholar and look at newer papers that cite those. Go to the authors' web pages and look at their other papers. Google for an interesting phrase or an unfamiliar piece of terminology. Look at other papers in the same journal or conference proceedings, or on the same shelf at the library.
Go check out wikipedia. Are there entries for all the topics we cover? Do they actually help you understand the tool and/or algorithm? Can you come up with a way to improve a few of these? (Note that wikipedia tends to be pretty good, in large part because CS students and faculty take the time to improve it, but that doesn't mean it can't be improved!)
Solve some problems. Ultimately, the only way to learn how to find good problems, and to learn how to solve them, is by actually doing it. Pick a problem. Try to solve it. If you succeed, what other problems does your solution suggest? What other problems seem likely to fall to your solution technique? If the solution was too easy, tweak the problem to make it harder. If you get stuck, try to formalize why you're stuck; you might be trying to do something impossible! Or tweak the problem to make it easier—consider a useful special case, or ignore the cost of some pesky operation. It's relatively rare for any open problem to be solved in its original stated form; more often, the problem and its solution evolve together as the research progresses.
Write everything down. Keep a research notebook with you at all times. (I strongly recommend paper over anything electronic, but use something you feel comfortable with.) Whenever you see, read, hear, or think of something even remotely interesting or relevant to your research, write it down. Especially write down half-baked ideas and "stupid" questions. Periodically read through your own notes; you'll be surprised at how much stuff you see (and think of!) in a single year. As you mature as a researcher, the content of your notebook will slowly will drift away from ideas you get from other people and toward your own thoughts and discoveries.