GA Interview Questions
Revision as of 00:48, 24 March 2020
This page contains interview questions that have been used for deciding who gets GA positions in the department. For each problem, you should produce (a) a description of how your solution works, and (b) a program that solves the problem correctly. You should put the description as a comment at the top of your code.
You should also include at the top of each solution a list of anything you used as a reference for your solution; if we find that you used something and did not cite it there is roughly a 0% chance we would award you a GA position.
You should use a programming language that is supported on the CS server. We will compile and run your solutions, so make sure they work. Supported programming languages include C, C++, Python, Java, R, PHP, JavaScript, and Octave. If you would like to use a different programming language, please check with us first. You should submit only your source files (unless otherwise stated in the problem).
To submit your solutions, attach them to an email reply to the associate chairperson of CS. Good luck!
Scoring
Each problem will be rated as correct, half correct, or incorrect; this value (1, 1/2, or 0) is multiplied by the difficulty rating for the problem. Thus for the spring 2019 questions, the maximum possible score is 10.5 (0.5 + 1 + 2 + 2 + 2 + 3, the difficulty ratings listed below). Note that we use your total score as part of our decision process, but it is not the only factor. We also take into account performance in your courses, Skype interviews, etc.
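As a small illustration of the scoring rule, here is a sketch in Python. The function name total_score and the (grade, difficulty) pair layout are made up for this example; they are not part of any official grading tool.

```python
def total_score(results):
    """Sum grade * difficulty over all problems.

    results: iterable of (grade, difficulty) pairs, where grade is
    1 (correct), 0.5 (half correct), or 0 (incorrect).
    """
    return sum(grade * difficulty for grade, difficulty in results)


# Hypothetical candidate: one correct problem rated 2, one half-correct
# problem rated 3, one incorrect problem rated 1.
example = total_score([(1, 2), (0.5, 3), (0, 1)])
```

Here example evaluates to 3.5, since 1*2 + 0.5*3 + 0*1 = 3.5.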
Spring 2020
The following are the interview questions to be solved for those being interviewed in the spring of 2020.
Graph Levels
Write a program that reads an input digraph G and decides whether its vertices can be partitioned into k parts V_1, V_2, ..., V_k such that every arc of G goes from some V_i to V_{i+1}, 1 <= i < k. If the answer is yes, the program should output the partite sets; otherwise it should print NO.
- Solution filename: graph-levels.* (c, py, java, etc.)
- Input format: adjacency list
- Partial credit:
- Full credit: output either "not possible" or the smallest k such that the graph can be partitioned as described
- Difficulty rating: 2
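One possible approach, sketched in Python under stated assumptions: the adjacency list has already been parsed into a dict that maps each vertex to a list of its out-neighbours (the actual file layout is not specified here), and the goal is the smallest k from the "Full credit" line. The idea is to walk each weakly connected component, giving level(v) = level(u) + 1 for every arc u -> v, and to report failure as soon as two arcs disagree. The name graph_levels is hypothetical.

```python
from collections import deque


def graph_levels(adj):
    """Return (k, level) with level[v] in 1..k, or None if no levelling exists.

    adj: dict mapping each vertex to a list of out-neighbours (assumed format).
    """
    # Undirected view with signed steps: +1 along an arc, -1 against it.
    steps = {v: [] for v in adj}
    for u, outs in adj.items():
        for v in outs:
            steps.setdefault(v, [])
            steps[u].append((v, 1))
            steps[v].append((u, -1))

    level = {}
    k = 0
    for start in steps:
        if start in level:
            continue
        level[start] = 0
        component = [start]
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, d in steps[u]:
                want = level[u] + d
                if v not in level:
                    level[v] = want
                    component.append(v)
                    queue.append(v)
                elif level[v] != want:
                    return None  # two arcs force different levels
        # Shift the component so that its lowest level is 1.
        low = min(level[v] for v in component)
        for v in component:
            level[v] -= low - 1
        k = max(k, max(level[v] for v in component))
    return k, level
```

For example, graph_levels({"a": ["b"], "b": ["c"], "c": []}) gives k = 3 with a, b, c on levels 1, 2, 3, while a directed triangle gives None. Each component is shifted independently, so the k returned is the largest span of any single component.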
Product Sum
Let a_1, a_2,..., a_{2020} be positive real numbers such that their product is 2^{-2020}. We sum these numbers to get a value x. What is the minimum value of x and what is the maximum value of x? Your solution should include an argument/proof for why your answers are correct.
- Solution filename: product-sum.* (txt, docx, pdf, etc.)
- Input format: NA
- Partial credit: correct value and argument for either min or max
- Full credit: correct values and arguments for both min and max
- Difficulty rating: 2
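The following is an outline of one standard line of argument (the AM-GM inequality for the lower bound, and an explicit family of examples for the upper side), written in LaTeX. It is a sketch rather than a complete write-up; the parameter t is introduced here only for illustration.

```latex
% Lower bound via AM-GM, with equality iff all a_i are equal:
\[
x = \sum_{i=1}^{2020} a_i
  \;\ge\; 2020 \Bigl( \prod_{i=1}^{2020} a_i \Bigr)^{1/2020}
  = 2020 \cdot \bigl( 2^{-2020} \bigr)^{1/2020}
  = 2020 \cdot \tfrac{1}{2}
  = 1010,
\]
% attained at a_1 = a_2 = \dots = a_{2020} = 1/2.

% Upper side: for any t > 0 take
\[
a_1 = t, \qquad a_2 = \frac{2^{-2020}}{t}, \qquad a_3 = \dots = a_{2020} = 1,
\]
% so the product is 2^{-2020} while
\[
x = t + \frac{2^{-2020}}{t} + 2018 \;\longrightarrow\; \infty
\quad \text{as } t \to \infty .
\]
```

Under this reading the minimum is 1010 and x is unbounded above, so no maximum exists.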
Exponential
We divide the number (1+2+3+...+2020)^{1^2+2^2+...+2020^2} by 7. What is the remainder? Your solution should include an argument/proof for why your answer is correct.
- Solution filename: exponential.* (txt, docx, pdf, etc.)
- Input format: NA
- Partial credit: correct solution with justification but not a formal proof
- Full credit: correct answer and formal proof of correctness
- Difficulty rating: 2
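A hand-derived answer can be sanity-checked numerically. The Python sketch below uses the closed forms n(n+1)/2 and n(n+1)(2n+1)/6 for the two sums and the built-in three-argument pow for modular exponentiation. It is a check only, not a substitute for the requested proof; the variable names are arbitrary.

```python
n = 2020
base = n * (n + 1) // 2                    # 1 + 2 + ... + 2020
exponent = n * (n + 1) * (2 * n + 1) // 6  # 1^2 + 2^2 + ... + 2020^2

# Direct modular exponentiation; Python handles the huge exponent efficiently.
remainder = pow(base, exponent, 7)

# The reduction a proof would typically use: since 7 does not divide base,
# Fermat's little theorem gives base^6 = 1 (mod 7), so only
# base mod 7 and exponent mod 6 matter.
reduced = pow(base % 7, exponent % 6, 7)

print(base % 7, exponent % 6, remainder, reduced)
```

The two results agreeing is a useful consistency check on the reduction step.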
Covid-19
Given covid-19 deaths data, compute for each date the countries with the top 10 highest number of new deaths for that date.
- Solution filename: covid-19.* (c, py, java, etc.)
- Input format: csv file linked above
- Output format: for each date (most recent first) the countries with the top 10 highest # of deaths, including the country name and number of deaths on that date
- Partial credit: (a) csv file in a similar format to the input but with totals per country rather than per state/province, or (b) for each date, print the top 10 entries on that date (each could be a state/province or a country)
- Difficulty rating: 2
Genes
Given gene transcripts, print out the percentage of base pairs that are A, T, C, G overall and as the first, second, last, and second-last in a transcript. Note that you should ignore any transcripts that are "transcript variant 2", variant 3, etc.
- Solution filename: genes.* (c, py, java, etc.)
- Input format: text file linked above (needs to be unzipped)
- Output format: percentages listed above
- Partial credit: one or more of the requested percentages is correct
- Difficulty rating: 2.5
Indy Weather
Given Indianapolis weather data, determine the average temperature for each year and whether it is higher than the average over all years prior to it, and print out the percentage of years that are warmer than the average of previous years.
- Solution filename: indy-weather.* (c, py, java, etc.)
- Input format: csv file linked above
- Output format: percentage of years that have an average temperature higher than the average of previous years, also print the average temperature for each year
- Partial credit: compute and print the average temperature for each year
- Difficulty rating: 2.5
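The core comparison can be sketched in Python independently of the csv layout, which is not described here. The sketch assumes the readings have already been grouped into a dict mapping year to a list of temperatures, that "average over all years prior" means the mean of the earlier yearly averages, and that the first year (which has no prior years) is left out of the percentage. All three are assumptions, and the function name is hypothetical.

```python
def warmer_than_history(temps_by_year):
    """Return (yearly_avg, is_warmer, percent_warmer).

    temps_by_year: dict year -> non-empty list of temperature readings
    (assumed to be parsed from the csv beforehand).
    """
    yearly_avg = {
        year: sum(temps) / len(temps)
        for year, temps in sorted(temps_by_year.items())
    }
    prior = []       # yearly averages seen so far
    is_warmer = {}   # year -> True if warmer than the mean of prior years
    for year, avg in yearly_avg.items():
        if prior:
            is_warmer[year] = avg > sum(prior) / len(prior)
        prior.append(avg)
    compared = len(is_warmer)
    percent = 100.0 * sum(is_warmer.values()) / compared if compared else 0.0
    return yearly_avg, is_warmer, percent
```

For instance, with {2000: [10, 12], 2001: [13], 2002: [11]} the yearly averages are 11, 13, and 11; 2001 is warmer than the prior average (11) and 2002 is not warmer than the prior average (12), giving 50 percent.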
Spring 2019
The following are the interview questions to be solved for those being interviewed in the spring of 2019.
Guess the Op
Given three integers, your program determines which operations could have resulted in the third integer from the first two. For example, for the integers 5, 3, 2, the operation could have been subtraction. For the integers 5, 4, 1 the operation could have been either subtraction or xor. Your program should output all operations that could have worked, from the following set of operations: + - * / % ^ & | << >>. Note that by % we mean remainder/mod, and the following are bit operations: ^ & | >> <<.
- Solution filename: guess-op.* (c, py, java, etc.)
- Input format: three integers, read from standard input
- Partial credit: not likely
- Full credit: output all operations that could have resulted in the third integer, or "none" if there aren't any.
- Difficulty rating: 0.5
Data Saver
For this problem your program should save on data space by determining the most frequent letter in an input text and then output the text with that letter removed.
- Solution filename: data-saver.* (c, py, java, etc.)
- Input format: your program should read its input from standard input, and should continue reading until end of file
- Partial credit: determine the most frequent letter and output it
- Full credit: the output should be only the input text with the most frequent letter removed. Note that this likely requires saving all of the text until the end of input, and at that point deciding what the most frequent letter was.
- Difficulty rating: 1
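A sketch of the core step in Python. The problem does not say how to treat case or ties, so this example assumes that upper- and lower-case forms count as the same letter and that a tie goes to the letter seen first; both choices are assumptions. The function name is hypothetical.

```python
from collections import Counter


def remove_most_frequent_letter(text):
    """Return text with every occurrence of its most frequent letter removed."""
    counts = Counter(ch.lower() for ch in text if ch.isalpha())
    if not counts:
        return text  # no letters at all: nothing to remove
    letter, _ = counts.most_common(1)[0]
    return "".join(ch for ch in text if ch.lower() != letter)
```

For example, remove_most_frequent_letter("Banana") returns "Bnn". A full solution would pass it the whole of standard input, e.g. sys.stdin.read(), since the most frequent letter is only known once all input has been seen.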
Mod-7 Exponents
Determine the value of the expression (2018^{2019} + 2019^{2018}) mod 7. Use whatever means you would like to determine the correct answer. Once you have the correct answer, write up a proof that the answer is correct. It is possible to compute the answer using a single piece of paper, if you use the right results from mathematics.
- Solution filename: mod7-exponent.* (txt, docx, pdf, etc.)
- Partial credit: Determine the correct answer and explain how you got it (if you used a program, then include the code for the program).
- Full credit: A proof that fits on one page and is easy enough to understand.
- Difficulty rating: 2
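Reading the expression as 2018^2019 + 2019^2018, the value can be checked in Python with the built-in three-argument pow. The second computation mirrors the reduction a one-page proof would likely use (bases reduced mod 7, exponents reduced mod 6 by Fermat's little theorem, valid because neither base is divisible by 7). This is a numerical check, not the proof itself.

```python
# Direct modular exponentiation.
direct = (pow(2018, 2019, 7) + pow(2019, 2018, 7)) % 7

# Same value after reducing bases mod 7 and exponents mod 6.
# Both reduced exponents are nonzero here, so no special case is needed.
reduced = (pow(2018 % 7, 2019 % 6, 7) + pow(2019 % 7, 2018 % 6, 7)) % 7

print(direct, reduced)
```

Both lines print the same value, which is what the written proof should arrive at.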
012 Graph Coloring
Write a program that reads a directed graph G and finds a labeling of its vertices with labels ("colors") 0, 1, 2 such that for every arc xy of G, label(y) = (label(x) + 1) mod 3. Explain your approach first.
- Solution filename: 012-coloring.* (c, py, java, etc.)
- Input format: your program should read a directed graph as an adjacency matrix or adjacency list.
- Partial credit: not likely
- Full credit: correct solution, output "yes" or "no" for the graph
- Additional files: When submitting your solution, you should also attach text files for a few of the test graphs that you tested.
- Difficulty rating: 2
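One way to approach this, sketched in Python: once one vertex of a weakly connected component is labeled, every other label in that component is forced, so a breadth-first search that propagates labels and checks for contradictions is enough. The sketch assumes the graph has already been parsed into a dict mapping each vertex to its out-neighbours; the name color_012 is hypothetical.

```python
from collections import deque


def color_012(adj):
    """Return a dict vertex -> label in {0, 1, 2}, or None if none exists.

    adj: dict mapping each vertex to a list of out-neighbours (assumed format).
    """
    # Undirected view: +1 when following an arc, -1 when going against it.
    nbrs = {v: [] for v in adj}
    for x, outs in adj.items():
        for y in outs:
            nbrs.setdefault(y, [])
            nbrs[x].append((y, 1))
            nbrs[y].append((x, -1))

    label = {}
    for start in nbrs:
        if start in label:
            continue
        label[start] = 0  # any choice works: labelings can be rotated mod 3
        queue = deque([start])
        while queue:
            x = queue.popleft()
            for y, d in nbrs[x]:
                want = (label[x] + d) % 3
                if y not in label:
                    label[y] = want
                    queue.append(y)
                elif label[y] != want:
                    return None  # contradiction: no valid labeling
    return label
```

A directed triangle a -> b -> c -> a is labeled 0, 1, 2, whereas a pair of opposite arcs a -> b, b -> a yields None. Printing "yes" or "no" then depends only on whether the result is None.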
10 Smallest
Your program should output the 10th smallest integer from a given input sequence.
- Solution filename: 10-smallest.* (c, py, java, etc.)
- Input format: sequence of at least 10 integers
- Partial credit: produce the correct answer, and a good explanation for your algorithm
- Full credit: algorithm that is, roughly speaking, as fast as possible
- Difficulty rating: 2
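As one possible direction, the Python sketch below keeps only the 10 smallest values seen so far in a bounded max-heap (stored as negated values, since heapq is a min-heap). It makes a single pass and uses constant extra space. It assumes duplicates count separately, so the 10th smallest of ten 5s and one 1 is 5. Whether this counts as "as fast as possible" is left to the reader; the function name and the k parameter are illustrative.

```python
import heapq


def kth_smallest(values, k=10):
    """Return the k-th smallest item of values (duplicates counted separately)."""
    heap = []  # negated values; -heap[0] is the largest of the k smallest so far
    for v in values:
        if len(heap) < k:
            heapq.heappush(heap, -v)
        elif v < -heap[0]:
            heapq.heapreplace(heap, -v)  # drop the current largest, add v
    if len(heap) < k:
        raise ValueError("need at least k values")
    return -heap[0]
```

For example, kth_smallest(range(100, 0, -1)) returns 10. The standard library's heapq.nsmallest(10, values)[-1] gives the same result with less code.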
Diagonal Walks
Your program should determine how many valid diagonal walks are possible between two points on a grid, subject to certain rules. We imagine that you start by standing in a Euclidean plane at the origin, (0, 0). You want to walk to the point (k, 0). We assume k is an integer. Each step you take should be one unit closer to your destination on the x axis. Thus the x coordinate of your location is first 0, then 1, then 2, etc. You can choose your y coordinate to be either +1 or -1 from your current location, but your y coordinate should never be negative. From the origin, you only have one choice - to position (1, 1). From (1, 1) your next move could be to (2, 2) or (2, 0).
You must determine how many diagonal walks follow these rules and take you from (0, 0) to (k, 0).
- Solution filename: diagonal.* (c, py, java, etc.)
- Input format: a single integer k, the distance along the x axis of your walk
- Partial credit: produce correct answer, good explanation for your algorithm
- Full credit: produce all of the valid diagonal walks, and if an appropriate setting is given (via command line or otherwise) all of the walks are printed
- Difficulty rating: 3
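The rules above can be sketched in Python in two parts: a dynamic-programming count that tracks how many partial walks reach each height, and a generator that lists the walks themselves. Both assume k is a non-negative integer. The function names are hypothetical, and how the "print all walks" setting is passed (command line or otherwise) is left open.

```python
def count_walks(k):
    """Number of valid walks from (0, 0) to (k, 0) that never go below y = 0."""
    ways = {0: 1}  # height y -> number of partial walks currently at that height
    for _ in range(k):
        nxt = {}
        for y, n in ways.items():
            for ny in (y + 1, y - 1):
                if ny >= 0:
                    nxt[ny] = nxt.get(ny, 0) + n
        ways = nxt
    return ways.get(0, 0)


def list_walks(k):
    """Yield each valid walk as a list of (x, y) points."""
    def extend(path):
        x, y = path[-1]
        if x == k:
            if y == 0:
                yield list(path)
            return
        for ny in (y + 1, y - 1):
            # Stay on or above the axis, and low enough to return to 0 in time.
            if 0 <= ny <= k - (x + 1):
                path.append((x + 1, ny))
                yield from extend(path)
                path.pop()

    yield from extend([(0, 0)])
```

For k = 4 there are two walks, and for odd k there are none, because every step changes the parity of y. The counting pass takes on the order of k^2 steps, while listing the walks necessarily takes time proportional to their number.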