File: hw2.txt

Name: 

Collaborators:

Due date: August 28, 2012, 11:59pm

To hand in: Create a directory in your CS account called 
  cs457.  Inside that directory, create a directory called
  handin.  Leave your completed hw2.txt in your handin directory.
  Make sure not to change it after the due date - I will check 
  the "last modified" time on the file.  

File permissions: make sure that your handin directory is not 
  publicly readable.  Use "chmod og-rx handin" and 
  "chmod og-rx handin/*" so that the directory and its files
  are not readable by others.  I can still read your files
  because I have admin/root access.

Task: Complete the problems below.

Grading: 5 points for each problem.  Partial credit will be given.

  For problem 1, be as detailed as possible.  It would be nice if 
  based on your description I could go and find the data and start 
  putting it into a database or getting information out of it right away.
  It also should be some kind of interesting data.

  For problem 2, you can choose whatever data structure you want, but 
  your answers to (i), (ii), (iii) should be correct and justified.


* Problem 1 *
Find some data online that would be interesting for us to 
analyze with SQL queries.  You must give the following information: 
link, basic description, format the data is in, idea of how to get 
the data into SQL.

Example that would get a "reasonable" grade (not great, but okay):

link: http://cas.sdss.org/astrodr7/en/tools/search/sql.asp

basic description: The data is from the Sloan Digital Sky Survey (SDSS).
  SDSS has taken telescopic measurements (brightness of objects in the sky 
  at different wavelengths) for large portions of the sky.  For example, we could
  query the SDSS data to find the number of stars that have been surveyed that
  have a certain brightness at different wavelengths.  This might tell us something
  about the number of blue giant stars in the galaxy.  Woo hoo.

data format: easy - it is stored in an SQL database that we can query.  The 
  data is also available as a download from NASA in some sort of raw format.  I 
  have not looked up what that format is since we would be able to get started 
  with the SQL interface.

getting data into SQL: The SQL interface is good enough the way it is.  But if 
  we wanted to do more intensive queries, they probably would not let us.  To 
  do that we would need to download the raw data and create our own database.
  I have not looked into how to do that yet.
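
  As a rough illustration of what "create our own database" could look like,
  here is a minimal sketch in Python using the built-in sqlite3 module.  The
  CSV layout, the table name "stars", and the column names obj_id, mag_g, and
  mag_r are all made up for this example; the real SDSS download has its own
  format and column names, which I have not checked.

```python
import csv
import io
import sqlite3

# Hypothetical sample data: an object id plus brightness (magnitude) in two
# wavelength bands.  These rows are invented, not real SDSS measurements.
SAMPLE_CSV = """obj_id,mag_g,mag_r
1,15.2,14.9
2,17.8,17.1
3,15.9,15.5
4,21.3,20.7
"""


def load_stars(conn, csv_text):
    """Create a 'stars' table and fill it from CSV text; return the row count."""
    conn.execute(
        "CREATE TABLE stars (obj_id INTEGER PRIMARY KEY, mag_g REAL, mag_r REAL)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(int(r["obj_id"]), float(r["mag_g"]), float(r["mag_r"])) for r in reader]
    conn.executemany("INSERT INTO stars VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def count_in_range(conn, low, high):
    """Count objects whose g-band magnitude lies between low and high, inclusive."""
    cur = conn.execute(
        "SELECT COUNT(*) FROM stars WHERE mag_g BETWEEN ? AND ?", (low, high)
    )
    return cur.fetchone()[0]


conn = sqlite3.connect(":memory:")  # throwaway in-memory database
load_stars(conn, SAMPLE_CSV)
print(count_in_range(conn, 15.0, 16.0))
```

  With real data, the same two steps would apply: define a table that matches
  the columns in the download, then bulk-insert the rows before querying.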


* Problem 2 *

Suppose we are going to store the following data and will be making
the following queries to the data.

Data: name, 991 number, major, and GPA of all ISU students (current and past).

Queries: (a) look-up based on 991 number
         (b) look-up students with the highest GPA
         (c) insert new student
         (d) delete student

Choose a data structure for how you would store this data.  Do not say "database".
Your choices are the data structures you have studied in other CS classes.
Good candidates are: hash table, binary search tree, binary heap, and things 
like those.

Given your choice, explain the following.

(i) What do you use for the key in your data structure?
(ii) What is the big-O running time of each of the queries listed above?
(iii) What is the big-O amount of space needed to store your data structure?

For (ii) and (iii) assume that comparisons take a constant amount of time and 
that each student record takes a constant amount of memory.  Assume there are 
n students total.
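
To show the kind of analysis that (i)-(iii) ask for, here is a rough sketch
of the four queries on a plain unsorted list.  This is only a baseline for
comparison and is probably not the structure you want to choose; the class
name, field names, and method names are invented for this illustration.  In
this sketch there is no key at all (records are just appended), which is why
most operations have to scan every record.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Student:
    name: str
    id_991: str  # the 991 number, kept as a string
    major: str
    gpa: float


class UnsortedListStore:
    """Baseline: records kept in arrival order.  Space is O(n) for n students."""

    def __init__(self) -> None:
        self.records: List[Student] = []

    def lookup(self, id_991: str) -> Optional[Student]:
        # (a) O(n): may have to scan every record.
        for s in self.records:
            if s.id_991 == id_991:
                return s
        return None

    def highest_gpa(self) -> List[Student]:
        # (b) O(n): one pass to find the maximum, one pass to collect ties.
        if not self.records:
            return []
        best = max(s.gpa for s in self.records)
        return [s for s in self.records if s.gpa == best]

    def insert(self, student: Student) -> None:
        # (c) O(1) amortized: append to the end, with no duplicate check.
        self.records.append(student)

    def delete(self, id_991: str) -> bool:
        # (d) O(n): scan to find the record, then remove it.
        for i, s in enumerate(self.records):
            if s.id_991 == id_991:
                del self.records[i]
                return True
        return False
```

Your answer should give the same kind of per-query justification for the
structure you pick, and should explain where it does better than this baseline.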