BD4ISU - Biomedical Big Data for ISU

Summer 2019


  1. Logging into the CS systems
    • Complete the "First task", "Second task", "Third task", and "Fourth task" listed under May 22 on the course homepage.
    Due: 22-May-2019 (by end of day)
  2. Linux bootcamp Part 1
    Due: 22-May-2019 (by end of day)
  3. Linux bootcamp Part 2
    Due: 22-May-2019 (by end of day)
  4. New Basics Quizzes
    • Coming soon
    Due: 23-May-2019 (by end of day)
  5. Gene Expression
    • Do the following in the GSE69618.R file:
    • Create a dataframe that only contains the WT columns and has log2 values (use the data.log2 dataframe as a starting point)
    • Create a dataframe with the average between the two replicates for each treatment. You should end up with 4 columns of numbers: WT day 0, WT day 2, WT day 6, WT day 10.
    • Take your 4 columns and multiply each by the appropriate factor so that the median of each of the four columns is exactly 10.
    • Create two new columns, computed as (WT day 2 - WT day 0) and (WT day 2 - WT day 10), and save the top 100 of each. Look up the names of the top 10 of each and save these as well.
    • At which time point do you think there is the best change (from day 0 to day 2, from day 2 to day 6, or from day 6 to day 10)? Why?
    Due: 24-May-2019 (by end of day)
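The steps in item 5 can be sketched on toy data as follows. This is only an illustration under stated assumptions: the `toy` data frame stands in for `data.log2`, the column names (`WT_d0_r1`, `KO_d0_r1`, and so on) are made up and will differ from the real GSE69618 columns, and "top" is taken to mean the largest difference.

```r
# Toy stand-in for data.log2: 6 made-up genes, 2 replicates per WT time point,
# plus one non-WT column. All names here are hypothetical.
set.seed(1)
sample.names <- c("WT_d0_r1", "WT_d0_r2", "WT_d2_r1", "WT_d2_r2",
                  "WT_d6_r1", "WT_d6_r2", "WT_d10_r1", "WT_d10_r2",
                  "KO_d0_r1")
toy <- as.data.frame(matrix(runif(6 * 9, min = 4, max = 12), nrow = 6,
                            dimnames = list(paste0("gene", 1:6), sample.names)))

# Keep only the columns whose name starts with "WT".
wt <- toy[, grepl("^WT", colnames(toy))]

# Average the two replicates for each time point (4 columns of numbers).
wt.avg <- data.frame(
  d0  = rowMeans(wt[, c("WT_d0_r1",  "WT_d0_r2")]),
  d2  = rowMeans(wt[, c("WT_d2_r1",  "WT_d2_r2")]),
  d6  = rowMeans(wt[, c("WT_d6_r1",  "WT_d6_r2")]),
  d10 = rowMeans(wt[, c("WT_d10_r1", "WT_d10_r2")])
)

# Multiply each column by 10 / median(column) so that its median becomes 10.
wt.scaled <- wt.avg
for (nm in colnames(wt.scaled)) {
  wt.scaled[[nm]] <- wt.scaled[[nm]] * 10 / median(wt.scaled[[nm]])
}

# Difference column, then the rows with the largest difference
# (3 rows here because the toy data only has 6 genes).
wt.scaled$d2_minus_d0 <- wt.scaled$d2 - wt.scaled$d0
top <- head(wt.scaled[order(wt.scaled$d2_minus_d0, decreasing = TRUE), ], 3)
print(rownames(top))
```

For the real data, the same pattern applies with 100 in place of 3 in `head()`, and `write.csv(top, "top.csv")` is one way to save the result (the file name is just an example).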
  6. Gene Expression, some more
    • Starting with Jeff's solution to the last assignment (files GSE69618.import.R and GSE69618.analyze.R), work on the questions at the bottom of GSE69618.analyze.R.
    • Analyze the data in http://cs.indstate.edu/~jkinne/bd4isu-summer-2019/code/GSE51483_data.csv as well. Start from the GSE69618 files and modify them to work for GSE51483. Note that this dataset has three replicates for each type of sample. For your analysis you can treat right and left ventricle samples as the same type of sample, which gives 6 replicates for samples from day 9 onward. Look at the following time points: ESC, embryo day 7, heart days 8/9/12, and adult heart; you can remove the other time points from your analysis. You can also treat the day 8/9/12 samples as one group, so that you end up with 4 averaged groups.
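One way to collapse many sample columns into a few averaged groups is a named vector that maps each column to its group. The sketch below uses toy data; the sample names (`ESC_1`, `H9_LV_1`, ...) and group labels are assumptions for illustration and are not the real GSE51483 column names.

```r
# Hypothetical map from sample column to analysis group.
# NA marks a time point that is dropped from the analysis.
groups <- c(ESC_1 = "ESC", ESC_2 = "ESC",
            E7_1 = "embryo_d7", E7_2 = "embryo_d7",
            H8_1 = "heart_d8_9_12", H9_LV_1 = "heart_d8_9_12",
            H9_RV_1 = "heart_d8_9_12", H12_LV_1 = "heart_d8_9_12",
            Adult_LV_1 = "adult_heart", Adult_RV_1 = "adult_heart",
            Other_1 = NA)

# Toy expression values: 4 made-up genes, one column per sample above.
set.seed(2)
toy <- as.data.frame(matrix(rnorm(4 * length(groups), mean = 8), nrow = 4,
                            dimnames = list(paste0("gene", 1:4), names(groups))))

# Samples that belong to one of the groups we keep.
keep <- names(groups)[!is.na(groups)]

# For each group, average across all of its sample columns
# (left and right ventricle samples share a group label, so they are pooled).
group.avg <- sapply(unique(groups[keep]), function(g) {
  rowMeans(toy[, keep[groups[keep] == g], drop = FALSE])
})

print(colnames(group.avg))
```

Keeping the mapping in one vector means that moving a sample to another group, or dropping it, is a one-line change.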
Note: course website layout/code/template from Steve Baker. Anything horrible is not his fault.