Assignments
Homework assignments will be posted to this website. Homework assignments
are due before class starts one week after they are handed out (so you
have one week to complete them). If you turn in an assignment late, I
will grade it so you know how you did, but it will count as a 0.
You should email your homework solution to me at
jkinne@cs.indstate.edu before it is due.
For each assignment, please download the template file by right-clicking on
the link and choosing
"Save target as" or "Save link as" (or whatever your browser calls it).
You should
put your answers into this file, and then turn in this file as
an attachment to an email. If you do not
turn in your homework in this way, I will take off points!
Project 1
For the second project, you can choose one of three projects to complete:
GIS image analysis, rock-paper-scissors game playing, or speech
recognition. There will be a graded halfway checkpoint. The
halfway checkpoint will be due Nov 29. The final version of the project
will be due Dec 15. Details about each project choice are below.
I will give you little, if any, code to start with. You will be doing these
yourself. Before the final result is turned in, we will establish
input/output formats or class/function names that must be used (for use
in testing, and also for playing the games against each other). But we will
wait until later to specify this. Also, note that both the halfway checkpoint
and final result will be graded according to what the syllabus says about
grading programs. Go and look at that again.
GIS Image Analysis.
- Halfway checkpoint. For the halfway checkpoint, you should have a working
neural net solving the problem described in the paper
Estimating impervious surfaces from medium spatial resolution imagery using the self-organizing map and multi-layer perceptron neural networks by
Xuefei Hu and Qihao Weng. To begin with, you will need to have a working
multi-layer neural net with learning. This includes both feed-forward and
back propagation. We did the feed-forward code in class, though we did
not test it to make sure it was correct. You will need to do the back
propagation part (a rough sketch of the back propagation update is given at
the end of this GIS section). Once we get the satellite data, you should
plug the data into your neural net to duplicate the neural-net results of
the paper.
UPDATE: I don't have the satellite data for you yet. Instead, we will
have as our intermediate goal to do handwriting recognition. This is
because that data is easily available, from
http://yann.lecun.com/exdb/mnist/. Once you get your back-prop
working, it will not be too much work to get it working on the handwriting
recognition problem. To help you with getting the data into your program,
see mnistHandwriting.py. That
file has the function that I used to load the data into my program and
put it into my back propagation algorithm. You can use that function in
your code too, so once you have your back propagation working, you can
download the files from the website listed above and be ready to go!
UPDATE 2: We have the image data available for download. Download
code/GISdata.txt, which has the data in it.
And you can look at code/GIShandwriting.py
for the function that extracts the pixel training information from the
data file. Note: the GISdata.txt file is not actually a text file; I don't
know why I named it .txt. You'll have to use the GIShandwriting.py file to
read information from the GISdata.txt file.
- Final result. Once you have duplicated the results, we will discuss what
potential improvements could be made - trying different neural net
configurations, using information about the neighbors of pixels, using
something other than neural nets, etc. You will
try out some of these and report on the results.
Alternatively, for the final result, if you want, you can discuss with
me other things to do (e.g., a character recognition program, or something
else related to image analysis).
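To give an idea of what the back propagation part involves, here is a rough
sketch of a one-hidden-layer network with sigmoid units trained by gradient
descent. The function and variable names (feed_forward, train_step, W1, W2,
and so on) are mine and do not match the in-class code, and the XOR run at the
bottom is only a sanity check; you will need to adapt the shapes and training
loop to the handwriting/GIS data yourself.

# Rough sketch: one hidden layer, sigmoid units, stochastic gradient descent.
# Names here are illustrative and do not match the in-class feed-forward code.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def feed_forward(x, W1, b1, W2, b2):
    # x is an input vector; returns hidden and output activations
    hidden = sigmoid(np.dot(W1, x) + b1)
    output = sigmoid(np.dot(W2, hidden) + b2)
    return hidden, output

def train_step(x, target, W1, b1, W2, b2, rate=0.5):
    hidden, output = feed_forward(x, W1, b1, W2, b2)
    # delta terms for sigmoid units (the derivative of sigmoid is a*(1-a))
    delta_out = (output - target) * output * (1 - output)
    delta_hid = np.dot(W2.T, delta_out) * hidden * (1 - hidden)
    # gradient descent updates; outer products give the weight gradients
    W2 -= rate * np.outer(delta_out, hidden)
    b2 -= rate * delta_out
    W1 -= rate * np.outer(delta_hid, x)
    b1 -= rate * delta_hid

# sanity check on XOR: 2 inputs, 3 hidden units, 1 output
np.random.seed(0)
W1, b1 = np.random.randn(3, 2), np.zeros(3)
W2, b2 = np.random.randn(1, 3), np.zeros(1)
data = [([0, 0], [0]), ([0, 1], [1]), ([1, 0], [1]), ([1, 1], [0])]
for epoch in range(5000):
    for x, t in data:
        train_step(np.array(x, float), np.array(t, float), W1, b1, W2, b2)
print([round(feed_forward(np.array(x, float), W1, b1, W2, b2)[1][0], 2)
       for x, _ in data])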
Game Playing.
On a preliminary basis, the game we will use is Rock-Paper-Scissors. This
simple game turns out to be fairly interesting. If a computer plays
completely at random, then it will essentially tie any human on average.
But, humans are terrible at being random, so a computer should be able to do
better than just tie the human. Furthermore, if a computer strategy is
not allowed to use randomness, then how well can it do?
We will decide for certain in class on Nov 15 what game to use. If you have
other suggestions, bring your suggestion and reasons to class.
Yep, we decided rock-paper-scissors is fine. And,
here is a link to one computer RPS player (that is not "too" hard to
tie or beat, after all...)
- Halfway checkpoint. You have the game working, which can be played
computer versus human, human versus human, or computer versus computer. The
game should keep a log file of all matches (that way we can potentially use
this information as training data). You should have the computer strategy
that just guesses at random, and you should have one other computer strategy
already working. Your goal in this part is to have a computer strategy
that beats humans more often than random guessing would. Your strategy may use
randomness.
- Final result. For the final part, add in a computer strategy that does
not use any randomness. One way to think of such a strategy is as a
finite state machine that selects the next choice based on what happened in
the previous N games (say, N = 20). The expectation is that having
a larger memory will result in a better computer player. So you should have
a randomness-free strategy working that can be given a parameter for how
far in the past it remembers (a rough sketch of one such frequency-counting
strategy is given at the end of this section). How much does the amount of memory matter?
How well does this strategy do against humans? How well against other
computer strategies?
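As one illustration of the memory-parameter idea above, here is a rough sketch
of a deterministic frequency-counting strategy. The class and method names are
mine; they are not the interface we will agree on in class for playing
strategies against each other.

# Sketch of a deterministic rock-paper-scissors strategy with a memory
# parameter.  It counts how often the opponent has played each move in a
# sliding window of the last `memory` rounds and plays the counter to the
# most frequent one.  Names are illustrative only, not the required interface.
from collections import Counter, deque

BEATS = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

class FrequencyStrategy:
    def __init__(self, memory=20):
        self.history = deque(maxlen=memory)   # opponent's recent moves

    def next_move(self):
        if not self.history:
            return "rock"                     # fixed (non-random) opening
        counts = Counter(self.history)
        # ties are broken by earliest appearance in the recent history,
        # so the choice stays deterministic
        predicted = counts.most_common(1)[0][0]
        return BEATS[predicted]

    def observe(self, opponent_move):
        self.history.append(opponent_move)

# usage: pick a move for this round, then record what the opponent played
player = FrequencyStrategy(memory=20)
move = player.next_move()
player.observe("scissors")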
Speech Recognition.
The goal of this project is to do some simple voice recognition. We can
decide as a class precisely what voice commands we aim to recognize. It will
be some relatively small set. Note that this project has a lot more
components involved than the other two, including learning some new material.
- Halfway checkpoint. Follow the strategy outlined in the textbook for
voice recognition. This includes taking the audio signal, breaking it up
into frames, and applying the FFT to extract features out of the audio
signal. Ultimately, for each frame you have a small number of features (the
book uses 39). Then those get put into a hidden Markov model. So, do
all of this so that it works for a speech model based on only two
sounds. Let's use the sound of "long a" and "hard t-". So the audio
stream would be a sequence of just those two sounds. By the halfway point,
you should have all of the pieces in place to analyze the audio sequence and
output the transcript of "long a" and "t-" sounds (a rough sketch of the
framing/FFT step is given at the end of this section).
Besides using numpy to do the FFT for us, I think we can do all of that
from scratch ourselves, without using other libraries.
- Final result. For the final result, we will add more sounds to the
speech model, and potentially do something with pulling words out of the
sounds. We will discuss this after the halfway checkpoint is done.
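For the framing and FFT step only, here is a rough sketch using numpy. The
frame length, step size, and number of features kept are placeholder values,
not course requirements, and the book's full pipeline (mel filterbanks, the
39-feature vectors, the hidden Markov model) is not shown.

# Split an audio signal into overlapping frames and keep a few FFT
# magnitudes as crude per-frame features.  Frame sizes are placeholders.
import numpy as np

def frame_signal(samples, frame_len=400, step=160):
    # e.g. 25 ms frames every 10 ms at a 16 kHz sampling rate
    frames = []
    for start in range(0, len(samples) - frame_len + 1, step):
        frames.append(samples[start:start + frame_len])
    return np.array(frames)

def frame_features(frames, n_features=13):
    # Hamming-window each frame, then keep the magnitudes of the first few
    # FFT bins as a small feature vector per frame
    window = np.hamming(frames.shape[1])
    spectra = np.abs(np.fft.rfft(frames * window, axis=1))
    return spectra[:, :n_features]

# usage: fake one second of 16 kHz audio, get a (frames x features) matrix
signal = np.random.randn(16000)
feats = frame_features(frame_signal(signal))
print(feats.shape)   # (98, 13) for these placeholder settings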
Exam 1
The final exam is cumulative. It will have the same format as
the first exam. Since the first exam, we have learned more about neural nets
and perceptrons, and have covered probability and Bayesian nets.
Topics
- Everything from the first exam. Somewhere between 1/2-2/3 of the
final will be on material that was covered on the first exam as well.
Expect similar questions for those topics, especially questions that
are similar to ones you have seen but just slightly different (so you
need to think a little to get the right answer, rather than just
memorizing).
- Simple probability questions using counting, about coin flips, playing
cards, etc.
- Probability definitions/rules you should know: definition of
conditional probability, definition of independence, chain rule,
law of total probability, Bayes' rule.
- What does the joint probability distribution mean and how many
probabilities need to be specified for it.
- Bayes' net - what things are independent of each other according to
the graph, what probabilities need to be specified to be able to
reproduce the full joint distribution, be able to compute conditional
probabilities based on a specified Bayes' net.
- Bayes' net algorithms - variable enumeration and variable
elimination. Be able to trace through the algorithms and think about how
they behave in general (a tiny enumeration example is sketched after this
topic list).
- Perceptrons - be able to do a transformation to the data, think about
perceptron learning in the transformed space and use the resulting
perceptron to make predictions. Check the in-class code and play with
it to get familiar.
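As a tiny worked example of computing a conditional probability from a
specified network by enumeration, here is a sketch for a made-up two-node
network Rain -> WetGrass. The numbers are invented for illustration; exam
questions will specify their own networks.

# Compute P(Rain | WetGrass) by enumerating the joint distribution of a
# made-up two-node network Rain -> WetGrass.
P_rain = {True: 0.2, False: 0.8}
P_wet_given_rain = {True: {True: 0.9, False: 0.1},
                    False: {True: 0.2, False: 0.8}}

def joint(rain, wet):
    # chain rule: P(rain, wet) = P(rain) * P(wet | rain)
    return P_rain[rain] * P_wet_given_rain[rain][wet]

# enumeration: sum the joint over the hidden variable, then normalize
numerator = joint(True, True)                       # P(Rain=T, Wet=T)
evidence = joint(True, True) + joint(False, True)   # P(Wet=T)
print(numerator / evidence)                         # P(Rain=T | Wet=T), about 0.53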
Exam 0
The first exam covers the part of the course up through neural nets, and just
the introduction of support vector machines. That includes what we talked
about in class, the first three homeworks, and the first project. Below is
a list of topics and a sample exam that is fairly similar to what the real one
will be like.
Topics
Uninformed search algorithms, including BFS, DFS, iterative deepening,
and uniform-cost search. For each, you should be able to execute the
algorithm on sample data (so review the examples where they do this in
the book), and should be able to explain their running time and
memory space usage, and explain why they are or are not optimal or
complete.
Heuristic search algorithms, including A* search. Admissible and
consistent heuristics (definition, and be able to say whether a given
heuristic is admissible or consistent). Optimality, completeness,
running time, and memory space of A*. Executing A* on example
data.
Local search, including hill-climbing, random-restart hill-climbing, and
variants (simulated annealing).
Be able to apply the algorithm, give pseudocode for the algorithm, and
analyze its completeness, running time, and memory space (a short
hill-climbing sketch is given after this topic list).
Constraint satisfaction problems. Be able to state a problem as a CSP.
Search on CSPs, including constraint propagation. Be able to apply the
algorithm to examples and analyze the properties of the algorithm.
Adversarial search, including min-max and alpha-beta pruning. Be able to
apply the algorithms on example data, analyze their time and memory
space, and explain why they make optimal choices if the whole game
space is explored. Be able to explain why alpha-beta pruning does
not work for non-zero-sum games and why it does work for zero-sum games.
Learning. The general framework of supervised learning - training data,
test data, cross validation, overfitting problem, learning
classification problems, ...
Neural networks. Perceptrons, including how they are defined and their
learning algorithm. What kinds of functions can they learn. Multiple-layer
neural networks, be able to decide their output on given data.
Running time of neural networks, what functions can a perceptron compute,
what functions can multiple-layer neural nets compute. Be able to give
a neural net for simple functions.
Support vector machines. For a standard SVM (with no non-linear
component), what does it do - find the "best" linear separator. Be able to,
by hand on an example with a few data points, say what line the SVM would
come up with (the best one). Be able to apply a transformation to some
data that makes the data linearly separable.
For all of the different algorithms, be able to compare and contrast, and
give reasons why you would use one or another in a given situation.
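Here is the short, generic hill-climbing sketch (with a random-restart
variant) referred to in the local search topic above. The neighbors/value
interface is my own naming, and the toy usage at the end is only for
illustration.

# Generic hill-climbing sketch; plug in your own state representation.
def hill_climb(start, neighbors, value, max_steps=1000):
    # greedy hill climbing: move to the best neighbor until none improves
    current = start
    for _ in range(max_steps):
        options = neighbors(current)
        if not options:
            break
        best = max(options, key=value)
        if value(best) <= value(current):
            break                    # local maximum reached
        current = best
    return current

def random_restart(make_random_state, neighbors, value, restarts=10):
    # random-restart variant: keep the best result of several climbs
    climbs = [hill_climb(make_random_state(), neighbors, value)
              for _ in range(restarts)]
    return max(climbs, key=value)

# toy usage: maximize -(x-3)^2 over the integers, neighbors are x-1 and x+1
value = lambda x: -(x - 3) ** 2
neighbors = lambda x: [x - 1, x + 1]
print(hill_climb(0, neighbors, value))   # climbs to the maximum at x = 3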
Sample Exam
Here are some sample questions. The real exam will have the same types
of questions and be about the same length.
True or False. 2 points each. For each, indicate true or false and
give a short explanation why.
- Depth first search is always faster than breadth first search.
- Consider the 8 queens problem. The following is an admissible
heuristic: the number of pairs of queens that are not attacking
each other.
- Consider a CSP that has 20 variables that can each be one of
three different values (for example, this could be a graph with
20 different nodes that we want to 3-color). Then solving the CSP
with constraint propagation will take at most 20*20 steps.
- Consider using min-max to play chess, where the game-tree is explored
to depth 10; at depth 10, the search is cut off and a heuristic value is
returned. Using min-max in this way will play chess optimally.
- Consider trying to learn a Boolean function on 100 variables. Then
if we start with a training set of 100*100 correct examples, we can
always come up with a correct classification on all possible inputs
if we use the right learning algorithm.
Trace the example. 3 points each. For each, show the steps of
running the indicated algorithm on the given data.
- Iterative deepening DFS. For the maze at
this link, label the intersections as nodes with connections between
nodes if one intersection leads directly to the other. Show the use
of iterative deepening DFS to find the way through this maze. You should
include the list of nodes that are the "current node".
- A*. Consider the following 8-puzzle.
1 4 2
3 5 _
6 7 8
The goal state for this puzzle is
_ 1 2
3 4 5
6 7 8
Use A* search with the Manhattan distance heuristic to solve the puzzle.
Remember the Manhattan distance heuristic is the sum of the distances of all
tiles from where they should be in the goal state, where the distance is
the Manhattan or city-block distance. Show your work, so at each step show
what the neighbor states are that are considered, what their heuristic
values are, and which is chosen to explore next.
- CSP with constraint propagation. Consider the following Sudoku puzzle:
1 _ 3 4
_ 3 _ _
2 _ _ _
_ _ _ _
We formulate this as a CSP by letting the blanks each be variables with
domain {1,2,3,4} and having constraints for "the Sudoku rules" (each
pair of variables in the same row or column or 2x2 block must have
different values). Solve
the puzzle using CSP search with constraint propagation. Show your
work.
- Alpha-Beta. Show alpha-beta pruning on
this search tree. That tree already shows
which search paths were pruned; that won't be the case on the test.
You should show the values of alpha and beta as you traverse the
tree, and do it depth-first, expanding on the left first.
- Neural Net. Suppose we have a perceptron with 5 input bits and
with weights 1.8, 1.6, 2.3, 2.7, and 2.8 on the input bits, and
weight 3.5 on the "fixed -1" that goes into the perceptron. Choose four
different values for the inputs, and show what value the perceptron
computes. Assume the perceptron is a threshold perceptron that outputs
1 iff the total weight is >= 0.
Note that on the test I will likely have you do a two-layer
neural net.
Explain/essay. 5 points each.
- Choose two uninformed search strategies and explain the benefits
and detriments of each (in terms of time, memory space, optimality,
completeness, and anything else that seems important),
and say which you would use in general and why.
- Consider trying to solve the 3-coloring problem. Given a
graph, try to find a way to color the nodes with only three colors total
(red, green, and blue) so that every node is a different color than
all of its neighbors. Formulate the problem as a CSP, and give pseudocode
to solve it using a CSP search that is depth-first and uses
constraint propagation.
- Pick a local search algorithm, give pseudocode for the algorithm, and
explain its running time, memory space usage, and whether it is
complete and/or optimal.
Project 0
For the first project, you can choose between completing the connect4
game-play project and the Boe Bot maze project. If you choose to complete
both projects, I will take the average grade from both and add approximately
one letter grade -- so two B's add up to one A, two C's add up to one B, etc.
For each project, you have a series of checkins before the completed
project is due. I will not grade the checkins, but I will give you feedback
on their correctness. You should plan on completing the checkins on time
if you are going to be able to complete the project on time.
And here are the checkins for each project...
Connect4 Project
- Project files to start with:
connect4.py,
connect4_match.py. You should
completely read through the comments and code in both files before
starting. If you want, you can choose to deviate from the structure
that I have setup in connect4.py, but your program must be runnable
by connect4_match; see that file for more information.
- Due: Sunday, Oct 2
Computing gameState correctly in placeToken so that gameDraw and gameWon
are correct. This means the game will correctly tell you when someone
has won or if the game is a tie.
- Due: Saturday, Oct 8
Basic alpha-beta pruning done. For this, you need to fill out the
heuristic function, the minValue and maxValue functions, and make any
changes necessary to the computerMove function (a generic alpha-beta
sketch, separate from the connect4.py structure, appears after this list).
- Final due date: Tuesday, Oct 11.
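For reference while filling in the connect4 functions, here is a generic
depth-limited alpha-beta sketch. It does not use the connect4.py structure or
its function names; the game interface (moves, result, is_terminal, heuristic)
is assumed here for illustration only.

# Generic depth-limited alpha-beta pruning.  This is NOT connect4.py's
# structure; it only illustrates the pruning logic.  The game object and
# its methods are assumptions for the sake of the example.
import math

def alpha_beta(state, depth, alpha, beta, maximizing, game):
    if depth == 0 or game.is_terminal(state):
        return game.heuristic(state)
    if maximizing:
        best = -math.inf
        for move in game.moves(state):
            best = max(best, alpha_beta(game.result(state, move),
                                        depth - 1, alpha, beta, False, game))
            alpha = max(alpha, best)
            if beta <= alpha:
                break          # beta cutoff: MIN will never allow this branch
        return best
    else:
        best = math.inf
        for move in game.moves(state):
            best = min(best, alpha_beta(game.result(state, move),
                                        depth - 1, alpha, beta, True, game))
            beta = min(beta, best)
            if beta <= alpha:
                break          # alpha cutoff: MAX already has something better
        return best

# typical top-level call: alpha_beta(start, 6, -math.inf, math.inf, True, game)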
Boe Bot Project
- Turn-in by 4pm 10/12: send your code, a description of your setup
(including, e.g., lighting conditions), your best test run, what
happened in class, comments on the hardware, etc.
- Due: Sunday, Oct 2
Boe Bot is assembled with IR emitters and sensors working. And you have
some basic program loaded and working - like the one I had in class,
or one to follow a black piece of tape.
- Due: Saturday, Oct 8
Boe Bot can follow a black piece of tape and follow the "right hand rule" -
anytime there is a junction/intersection, it will turn right if possible.
This algorithm will get the robot out of any maze where all the tape is
connected (there are no island walls; search for "maze right hand
rule" for more on that). A simple simulation sketch of the right-hand rule
is given at the end of this section.
- Final due date: Tuesday, Oct 11.
Goal: can you do something better/smarter than the right-hand rule?
Something that works even if there are island walls?
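For thinking about the maze strategy, here is a rough Python simulation of the
right-hand rule on a small grid maze. The Boe Bot itself runs BASIC Stamp
(PBASIC) code, so this is only a model for reasoning about the algorithm, and
the maze encoding is made up.

# Right-hand-rule wall follower on a grid maze (simulation only, not robot
# code).  '#' = wall, '.' = open, 'E' = exit; the maze below is made up.
DIRS = [(0, 1), (1, 0), (0, -1), (-1, 0)]   # E, S, W, N (clockwise)

def right_hand_escape(maze, start, facing=0, max_steps=10000):
    r, c = start
    for _ in range(max_steps):
        if maze[r][c] == 'E':
            return (r, c)                    # reached the exit
        # try a right turn first, then straight, then left, then reverse
        for turn in (1, 0, -1, 2):
            d = (facing + turn) % 4
            nr, nc = r + DIRS[d][0], c + DIRS[d][1]
            if maze[nr][nc] != '#':
                facing, r, c = d, nr, nc
                break
    return None                              # gave up (e.g. stuck circling an island)

maze = ["#####",
        "#...#",
        "#.#.#",
        "#..E#",
        "#####"]
print(right_hand_escape(maze, (1, 1)))       # finds the exit at (3, 3)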
Possible assignments/projects
(Just FYI, and for me to remember...)
- (See also ideas mentioned in the syllabus)
- TSP - using A* with MST heuristic (and other TSP...).
- Any NP-complete problem using search algorithms or ...
- Something with relevance to ISU or Math/CS dept.?
- Route-planning/maze navigation with robots (implement DFS or others).
- Rather than doing any algorithm, reduce to SAT and use existing
SAT solvers. How well does that perform?
- SAT solver
- Theorem prover
- Expert system
- Speech recognition
- Image recognition
- Watson
- Evolvability
- Robots
- "Therapist", Turing test
- WordNet
- NP-complete/hard, exact vs. approximation vs. heuristic
- Turing test philosophic discussions - quantum, ...