Assignments

[Note: this webpage last modified Friday, 09-Dec-2011 08:28:56 EST]

Homework assignments will be posted to this website. Each assignment is due before class starts one week after it is handed out (so you have one week to complete it). If you turn in an assignment late, I will grade it so you know how you did, but it will count as a 0.

You should email your homework solution to me at jkinne@cs.indstate.edu before it is due.

For each assignment, please download the template file by right-clicking on the link and choosing "Save target as" or "Save link as" (or whatever your browser calls it). You should put your answers into this file, and then turn in this file as an attachment to an email. If you do not turn in your homework in this way, I will take off points!

Project 1

For the second project, you can choose among three projects to complete - GIS image analysis, game-playing rock-paper-scissors, and speech recognition. There will be a graded halfway checkpoint, due Nov 29. The final version of the project will be due Dec 15. Details about each project choice are below.

I will give you little, if any, code to start with. You will be doing these yourself. Before the final result is turned in, we will establish input/output formats or class/function names that must be used (for use in testing, and also for playing the games against each other). But we will wait until later to specify this. Also, note that both the halfway checkpoint and final result will be graded according to what the syllabus says about grading programs. Go and look at that again.

GIS Image Analysis.

  1. Halfway checkpoint. For the halfway checkpoint, you should have a working neural net solving the problem described in the paper "Estimating impervious surfaces from medium spatial resolution imagery using the self-organizing map and multi-layer perceptron neural networks" by Xuefei Hu and Qihao Weng. To begin with, you will need to have a working multi-layer neural net with learning. This includes both feed-forward and back propagation. We did the feed-forward code in class, though we did not test it to make sure it was correct. You will need to do the back propagation part. Once we get the satellite data, you should plug the data into your neural net to duplicate the results of the paper that have to do with neural nets.

    UPDATE: I don't have the satellite data for you yet. Instead, we will have as our intermediate goal to do handwriting recognition. This is because that data is easily available, from http://yann.lecun.com/exdb/mnist/. Once you get your back-prop working, it will not be too much work to get it working on the handwriting recognition problem. To help you with getting the data into your program, see mnistHandwriting.py. That file has the function that I used to load the data into my program and put it into my back propagation algorithm. You can use that function in your code too, so once you have your back propagation working, you can download the files from the website listed above and be ready to go!

    UPDATE 2: We have the image data available for download. Download code/GISdata.txt, which has the data in it, and look at code/GIShandwriting.py for the function that extracts the pixel training information from the data file. Note: the GISdata.txt file is not a text file (I don't know why I named it .txt). You'll have to use the GIShandwriting.py file to read information from the GISdata.txt file.

  2. Final result. Once you have duplicated the results, we will discuss what potential improvements could be made - trying different neural net configurations, using information about the neighbors of pixels, using something other than neural nets, etc. You will try out some of these and report on the results.

    Alternatively, for the final result, if you want to you can discuss with me other things to do (e.g., do a character recognition program, or something else related to image analysis).
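As a rough illustration of the feed-forward and back propagation pieces described above, here is a minimal sketch in plain Python. It is not the code from class and not the network from the paper: the class name TinyMLP, the single hidden layer, the sigmoid units, the squared-error loss, the learning rate, and the toy OR data are all assumptions chosen to keep the example small.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyMLP:
    """Hypothetical one-hidden-layer net: sigmoid units, squared-error loss."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # Each row holds one unit's weights; the last entry is its bias weight.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    def forward(self, x):
        """Feed-forward pass; returns (hidden activations, output activations)."""
        xb = list(x) + [1.0]  # append the constant bias input
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb)))
                  for row in self.w_hidden]
        hb = hidden + [1.0]
        out = [sigmoid(sum(w * v for w, v in zip(row, hb)))
               for row in self.w_out]
        return hidden, out

    def train_step(self, x, target, lr=0.5):
        """One back propagation update; returns the error measured before it."""
        hidden, out = self.forward(x)
        xb = list(x) + [1.0]
        hb = hidden + [1.0]
        # Output deltas for E = 0.5 * sum((out - target)^2).
        d_out = [(o - t) * o * (1.0 - o) for o, t in zip(out, target)]
        # Hidden deltas: push the output deltas back through w_out
        # (computed before w_out is changed).
        d_hidden = [h * (1.0 - h) *
                    sum(d_out[k] * self.w_out[k][j] for k in range(len(out)))
                    for j, h in enumerate(hidden)]
        # Gradient descent on both weight layers.
        for k, row in enumerate(self.w_out):
            for j in range(len(row)):
                row[j] -= lr * d_out[k] * hb[j]
        for j, row in enumerate(self.w_hidden):
            for i in range(len(row)):
                row[i] -= lr * d_hidden[j] * xb[i]
        return 0.5 * sum((o - t) ** 2 for o, t in zip(out, target))


# Toy data (the OR function), used only to check that the error goes down.
data = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [1])]
net = TinyMLP(2, 3, 1)
first_error = sum(net.train_step(x, t) for x, t in data)
for _ in range(2000):
    last_error = sum(net.train_step(x, t) for x, t in data)
print(first_error, last_error)
```

For the handwriting or GIS data, the input size would be the number of pixel features and the output size the number of classes; the update rule itself would stay the same under these assumptions.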

Game Playing.

On a preliminary basis, the game we will use is Rock-Paper-Scissors. This simple game turns out to be fairly interesting. If a computer plays completely at random, then it will essentially tie any human on average. But, humans are terrible at being random, so a computer should be able to do better than just tie the human. Furthermore, if a computer strategy is not allowed to use randomness, then how well can it do?

We will decide for certain in class on Nov 15 what game to use. If you have other suggestions, bring your suggestion and reasons to class.

Yep, we decided rock-paper-scissors is fine. And, here is a link to one computer RPS player (that is not "too" hard to tie or beat, after all...)

  1. Halfway checkpoint. You have the game working, which can be played computer versus human, human versus human, or computer versus computer. The game should keep a log file of all matches (that way we can potentially use this information as training data). You should have the computer strategy that just guesses at random, and you should have one other computer strategy already working. Your goal in this part is to have a computer strategy that beats humans better than random guessing. Your strategy may use randomness.
  2. Final result. For the final part, add in a computer strategy that does not use any randomness. One way to think of such a strategy is as a finite state machine that selects the next choice based on what happened in some fixed number of previous games (say, 20). It should be the case that having a larger memory will result in a better computer player. So you should have a random-less strategy working that can be given a parameter saying how far into the past it remembers. How much does the amount of memory matter? How well does this strategy do against humans? How well against other computer strategies?
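One possible shape for a strategy with no randomness and a memory parameter is sketched below. It is only an illustration, not the required interface (the input/output formats will be fixed later): the function name deterministic_move, the move names, the fixed opening move, and the frequency-counting rule are all assumptions. It predicts that the opponent will repeat their most common move among the last few rounds and plays whatever beats it.

```python
from collections import Counter

MOVES = ("rock", "paper", "scissors")
# BEATS[m] is the move that beats m.
BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}


def deterministic_move(opponent_history, memory=20):
    """Choose a move using only the opponent's last `memory` moves.

    No random numbers are used, so the same history always gives the same move.
    """
    recent = opponent_history[-memory:] if memory > 0 else []
    if not recent:
        return "rock"  # fixed opening move (an arbitrary choice)
    counts = Counter(recent)
    # Most frequent recent move; ties are broken by the fixed order in MOVES.
    predicted = max(MOVES, key=lambda m: (counts[m], -MOVES.index(m)))
    return BEATS[predicted]


print(deterministic_move(["rock", "rock", "paper"]))
print(deterministic_move(["rock"] * 5 + ["scissors"] * 2, memory=2))
```

Running the same strategy with different values of memory against the logged matches is one way to start answering how much the amount of memory matters.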

Speech Recognition.

The goal of this project is to do some simple voice recognition. We can decide as a class precisely what voice commands we aim to recognize. It will be some relatively small set. Note that this project has a lot more components involved than the other two, including learning some new material.

  1. Halfway checkpoint. Follow the strategy outlined in the textbook for voice recognition. This includes taking the audio signal, breaking it up into frames, and applying the FFT to extract features out of the audio signal. Ultimately, for each frame you have a small number of features (the book uses 39). Then those get put into a hidden Markov model. So, do all of this, so that it works for a speech model based on only two sounds. Let's use the sound of "long a" and "hard t-". So the audio stream would be a sequence of just those two sounds. By the halfway point, you should have all of the pieces in place to analyze the audio sequence and output the transcript of "long a" and "t-" sounds.

    Besides using numpy to do the FFT for us, I think we can do all of that from scratch ourselves, without using other libraries.

  2. Final result. For the final result, we will add more sounds to the speech model, and potentially do something with pulling words out of the sounds. We will discuss this after the halfway checkpoint is done.
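The framing and spectrum steps from the halfway checkpoint might look roughly like the sketch below. It is a starting point under several assumptions, not the textbook's full pipeline: the function names, the frame length and hop size, and the Hamming window are illustrative choices, and the output is a plain magnitude spectrum rather than the 39 features the book uses. To keep the example free of dependencies it uses a slow direct DFT; in the project, numpy.fft.rfft would take the place of the inner sum.

```python
import cmath
import math


def frames(signal, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def hamming(n):
    """Hamming window of length n, used to taper the edges of each frame."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..n/2 for one windowed frame (direct O(n^2) DFT)."""
    n = len(frame)
    x = [s * w for s, w in zip(frame, hamming(n))]
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]


# Synthetic test tone: exactly 4 cycles per 32-sample frame.
tone = [math.sin(2 * math.pi * 4 * t / 32) for t in range(96)]
tone_frames = frames(tone, frame_len=32, hop=16)
spectrum = magnitude_spectrum(tone_frames[0])
peak_bin = max(range(len(spectrum)), key=lambda k: spectrum[k])
print(len(tone_frames), peak_bin)
```

Each frame's spectrum (or features derived from it) would then serve as one observation for the hidden Markov model.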

Exam 1

The final exam is cumulative. It will have the same kind of format as the first exam. Since the first exam, we have learned more about neural nets and perceptrons, and have covered probability and Bayesian nets.

Topics

Exam 0

The first exam covers the part of the course up through neural nets, and just the introduction of support vector machines. That includes what we talked about in class, the first three homeworks, and the first project. Below is a list of topics and a sample exam that is fairly similar to what the real one will be like.

Topics

Sample Exam

Here are some sample questions. The real exam will have the same types of questions and be about the same length.

True or False. 2 points each. For each, indicate true or false and give a short explanation why.

  1. Depth first search is always faster than breadth first search.
  2. Consider the 8 queens problem. The following is an admissible heuristic: the number of pairs of queens that are not attacking each other.
  3. Consider a CSP that has 20 variables that can each be one of three different values (for example, this could be a graph with 20 different nodes that we want to 3-color). Then solving the CSP with constraint propagation will take at most 20*20 steps.
  4. Consider using min-max to play chess, where the game-tree is explored to depth 10; at depth 10, the search is cut off and a heuristic value is returned. Using min-max in this way will play chess optimally.
  5. Consider trying to learn a Boolean function on 100 variables. Then if we start with a training set of 100*100 correct examples, we can always come up with a correct classification on all possible inputs if we use the right learning algorithm.

Trace the example. 3 points each. For each, show the steps of running the indicated algorithm on the given data.

  1. Iterative deepening DFS. For the maze at this link, label the intersections as nodes with connections between nodes if one intersection leads directly to the other. Show the use of iterative deepening DFS to find the way through this maze. You should include the list of nodes that are the "current node".
  2. A*. Consider the following 8-puzzle.
    1 4 2
    3 5 _
    6 7 8
    The goal state for this puzzle is
    _ 1 2
    3 4 5
    6 7 8
    Use A* search with the Manhattan distance heuristic to solve the puzzle. Remember the Manhattan distance heuristic is the sum of the distances of all tiles from where they should be in the goal state, where the distance is the Manhattan or city-block distance. Show your work, so at each step show what the neighbor states are that are considered, what their heuristic values are, and which is chosen to explore next.
  3. CSP with constraint propagation. Consider the following Sudoku puzzle:
    1 _ 3 4
    _ 3 _ _
    2 _ _ _
    _ _ _ _
    We formulate this as a CSP by letting the blanks each be variables with domain {1,2,3,4} and having constraints for "the Sudoku rules" (each pair of variables in the same row or column or 2x2 block must have different values). Solve the puzzle using CSP search with constraint propagation. Show your work.
  4. Alpha-Beta. Show alpha-beta pruning on this search tree. That tree already shows which search paths were pruned; that won't be the case on the test. You should show the values of alpha and beta as you traverse the tree, and do it depth-first, expanding on the left first.
  5. Neural Net. Suppose we have a perceptron with 5 input bits and with weights 1.8, 1.6, 2.3, 2.7, and 2.8 on the input bits, and weight 3.5 on the "fixed -1" that goes into the perceptron. Choose four different values for the inputs, and show what value the perceptron computes. Assume the perceptron is a threshold perceptron that outputs 1 iff the total weight is >= 0.

    Note that on the test I will likely have you do a two layer neural net.
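To make the threshold convention in the neural net question concrete, here is a small sketch of how such a perceptron's output could be computed. The function name perceptron and the sample inputs are illustrative assumptions; the weights and the "fixed -1" input are the ones stated in the question.

```python
WEIGHTS = [1.8, 1.6, 2.3, 2.7, 2.8]  # weights on the 5 input bits
BIAS_WEIGHT = 3.5                    # weight on the fixed -1 input


def perceptron(bits):
    """Threshold perceptron: outputs 1 iff the total weighted input is >= 0."""
    total = sum(w * b for w, b in zip(WEIGHTS, bits)) + BIAS_WEIGHT * -1
    return 1 if total >= 0 else 0


# A few sample inputs (chosen only for illustration).
for bits in ([0, 0, 0, 0, 0], [1, 1, 0, 0, 0], [0, 0, 0, 1, 1], [1, 1, 1, 1, 1]):
    print(bits, perceptron(bits))
```

For example, the input 0 0 0 1 1 gives 2.7 + 2.8 - 3.5 = 2.0, which is >= 0, so the output is 1; the input 1 1 0 0 0 gives 1.8 + 1.6 - 3.5 = -0.1, so the output is 0.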

Explain/essay. 5 points each.

  1. Choose two uninformed search strategies and explain the benefits and detriments of each (in terms of time, memory space, optimality, completeness, and anything else that seems important), and say which you would use in general and why.
  2. Consider trying to solve the 3-coloring problem. Given a graph, try to find a way to color the nodes with only three colors total (red, green, and blue) so that every node is a different color than all of its neighbors. Formulate the problem as a CSP, and give pseudocode to solve it using a CSP search that is depth-first and uses constraint propagation.
  3. Pick a local search algorithm, give pseudocode for the algorithm, and explain its running time, memory space usage, and whether it is complete and/or optimal.

Project 0

For the first project, you can choose between completing the connect4 game-play project or the Boe Bot maze project. If you choose to complete both projects, I will take the average grade from both and add approximately one letter grade -- so two B's add up to one A, two C's add up to one B, etc.

For each project, you have a series of checkins before the completed project is due. I will not grade the checkins, but I will give you feedback as to the correctness. You should plan on completing the checkins on time if you are going to be able to complete the project on time. And here are the checkins for each project...

Connect4 Project

Boe Bot Project

Possible assignments/projects

(Just FYI, and for me to remember...)