Create a dataframe that contains only the WT columns and has log2 values (use the data.log2 dataframe as a starting point).
Create a dataframe with the average between the two replicates for each treatment. You should end up
with 4 columns of numbers: WT day 0, WT day 2, WT day 6, WT day 10.
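As a rough illustration of the two steps above, here is a minimal sketch on toy data. The column names (WT_d0_r1, KO_d0_r1, and so on) are hypothetical, since the actual layout of data.log2 is not shown here; the patterns would need adjusting to the real headers.

```r
# Toy stand-in for data.log2. The real column names are not given in the
# assignment, so these names are assumptions made for illustration only.
data.log2 <- data.frame(
  WT_d0_r1 = c(8, 10, 12), WT_d0_r2 = c(10, 10, 14),
  WT_d2_r1 = c(9, 11, 12), WT_d2_r2 = c(9, 13, 12),
  KO_d0_r1 = c(7, 9, 11),  KO_d0_r2 = c(7, 9, 11),
  row.names = c("geneA", "geneB", "geneC")
)

# Keep only the wild-type columns (assumes they all start with "WT_").
wt <- data.log2[, grepl("^WT_", colnames(data.log2)), drop = FALSE]

# Strip the replicate suffix so both replicates share one treatment label.
treatment <- sub("_r[0-9]+$", "", colnames(wt))

# Average the replicate columns of each treatment, row by row.
wt.mean <- as.data.frame(sapply(unique(treatment), function(t) {
  rowMeans(wt[, treatment == t, drop = FALSE])
}))

print(wt.mean)
```

With the real data there would be four treatment labels rather than two, giving the four averaged columns. Note that averaging log2 values corresponds to a geometric mean of the original intensities.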
Take your 4 columns and multiply each by the appropriate factor so that the median of each of the four columns
is exactly 10.
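One possible way to do the median scaling is sketched below on toy numbers. The dataframe name wt.mean and its columns are hypothetical placeholders for your own averaged columns.

```r
# Toy averaged data; names and values are assumptions for illustration.
wt.mean <- data.frame(
  WT_d0 = c(4, 8, 16),
  WT_d2 = c(5, 20, 25),
  row.names = c("geneA", "geneB", "geneC")
)

# Current median of each column, and the factor that moves it to 10.
medians <- sapply(wt.mean, median)
factors <- 10 / medians

# Multiply every column by its own factor (MARGIN = 2 means "by column").
wt.scaled <- as.data.frame(sweep(as.matrix(wt.mean), 2, factors, `*`))

# Check: every column median should now be 10.
print(sapply(wt.scaled, median))
```

The factor for each column is simply 10 divided by that column's current median, so columns with a larger median are scaled down and columns with a smaller median are scaled up.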
Create new columns computed as (WT day 2 - WT day 0) and (WT day 2 - WT day 10), and save the
top 100 of each. Look up the names of the top 10 of each and save these as well.
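The difference-and-rank step might look something like the sketch below. It assumes that "top" means the largest positive difference and that gene names are stored as row names; both are assumptions, as are the column names and the helper top.n.

```r
# Toy scaled data; names and values are assumptions for illustration.
wt.scaled <- data.frame(
  WT_d0  = c(10, 9, 12, 10),
  WT_d2  = c(14, 9, 11, 10.5),
  WT_d10 = c(10, 12, 11, 10),
  row.names = c("geneA", "geneB", "geneC", "geneD")
)

# Difference columns (values are log2, so a difference is a log fold change).
wt.scaled$d2_minus_d0  <- wt.scaled$WT_d2 - wt.scaled$WT_d0
wt.scaled$d2_minus_d10 <- wt.scaled$WT_d2 - wt.scaled$WT_d10

# Hypothetical helper: the n rows with the largest value in one column.
top.n <- function(df, column, n) {
  ord <- order(df[[column]], decreasing = TRUE)
  head(df[ord, , drop = FALSE], n)
}

# n = 2 here because the toy data is tiny; the assignment asks for 100 and 10.
top.d2.d0  <- top.n(wt.scaled, "d2_minus_d0", 2)
top.d2.d10 <- top.n(wt.scaled, "d2_minus_d10", 2)

# Save one result to a file (written to a temporary directory in this sketch).
out.file <- file.path(tempdir(), "top_d2_minus_d0.csv")
write.csv(top.d2.d0, out.file)

print(rownames(top.d2.d0))
```

If you are also interested in genes that decrease, ranking by abs() of the difference, or using decreasing = FALSE, would be the corresponding change.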
At which time point do you think there is the best change (from day 0 to day 2, from day 2 to day 6,
or from day 6 to day 10)? Why?
Due: 24-May-2019 (by end of day)
Gene Expression, some more
Starting with Jeff's solution to the last assignment (files GSE69618.import.R and
GSE69618.analyze.R), work on the questions at the bottom of
GSE69618.analyze.R.
Take the data in
http://cs.indstate.edu/~jkinne/bd4isu-summer-2019/code/GSE51483_data.csv and analyze this
data as well. You should start with the files from GSE69618 and make changes to make it work
for GSE51483. Note that this dataset has three replicates for each type of sample.
For your analysis you can treat right and left ventricle samples as the same type of sample,
which gives 6 replicates for the samples starting from day 9.
You should look at the following time points: ESC, embryo day 7, heart days 8/9/12, adult heart. You
can remove the other time points from your analysis. You can also treat the day 8/9/12 samples as
one group, so that you will have 4 averaged groups in your analysis.
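The grouping described above could be sketched as follows. The sample column names and the helper group.of are hypothetical, because the real headers in GSE51483_data.csv are not shown here; the patterns would need to be adapted after inspecting the file.

```r
# With the real data you would start from something like
#   expr <- read.csv("http://cs.indstate.edu/~jkinne/bd4isu-summer-2019/code/GSE51483_data.csv")
# The toy columns below are assumptions made for illustration only.
expr <- data.frame(
  ESC_1 = c(1, 2), ESC_2 = c(3, 2),
  E7_1 = c(4, 4),
  heart_d8_1 = c(5, 6), heart_d9_LV_1 = c(7, 6), heart_d9_RV_1 = c(9, 6),
  heart_d12_LV_1 = c(3, 6),
  adult_1 = c(8, 1), adult_2 = c(10, 3),
  other_1 = c(0, 0),
  row.names = c("geneA", "geneB")
)

# Hypothetical helper: map a sample column name to one of the 4 groups,
# or NA for time points that are dropped from the analysis.
group.of <- function(sample) {
  if (grepl("^ESC", sample)) {
    "ESC"
  } else if (grepl("^E7", sample)) {
    "embryo_d7"
  } else if (grepl("^heart_d(8|9|12)", sample)) {
    "heart_d8_9_12"   # left and right ventricle are pooled here
  } else if (grepl("^adult", sample)) {
    "adult_heart"
  } else {
    NA_character_
  }
}

groups <- vapply(colnames(expr), group.of, character(1))
keep <- !is.na(groups)

# Average all sample columns that belong to the same group.
group.mean <- sapply(unique(groups[keep]), function(g) {
  rowMeans(expr[, keep & groups == g, drop = FALSE])
})

print(group.mean)
```

Keeping the sample-to-group mapping in one function means the rest of the GSE69618 analysis code only needs to see the 4 averaged columns.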
Note: course website layout/code/template from Steve Baker.
Anything horrible is not his fault.