# GA Inteview Questions

This page contains interview questions that have been used for deciding on who gets GA positions in the department. For each problem, you should produce (a) a description of how your solutions that works, and (b) a program that solves the problem correctly. You should put the description as a comment at the top of your code.

You should use a programming language that is supported on the CS server. We will compile and run your solutions, so make sure they work. Supported programming languages include - C, C++, Python, Java, R, php, javascript, Octave. If you would like to use a different programming language, please check with us first. For submitting your solutions, you should only submit your source files (unless otherwise stated in the problem).

## Contents

## Rules and Scoring

**Rules** Do not discuss the problems with anyone else (nobody else - not a fellow student, not a relative, not stack overflow, nobody). Do not try to find solutions to the problems online. You may use only - (a) texts/resources on algorithms, (b) documentation of programming languages and libraries/packages/modules. You should include at the top of each solution a list of anything you used as a reference for your solution; if we find that you used something and did not cite it there is roughly a 0% chance we would award you a GA position.

**Input/Output Formatting** If you do not follow the rules for input/output formatting as listed in the question descriptions your submission will likely be judged as incorrect or only partially correct. For example, for the Covid-19 problem in the Spring 2020 problem set, if you write a program that reads data from a different data source than listed in the problem, your submission would be judged as incorrect (for example you might find a data source that accomplishes some portion for you, which wouldn't be fair).

**Clarifications** You may write to the associate chairperson for clarifications, but these are not likely to be answered. An answer will only be given if you have found a mistake in the problem. For all other questions, your ability to understand the problem statement without asking for clarification is part of the competition process. So, feel free to write to the associate chairperson about the problems, and you can take a lack of response as an indication that you should be able to figure out the answer to your question just by reading the problem statement carefully.

**Scoring** Each problem will be rated as correct, half correct, or incorrect; this value (1, 1/2, or 0) is multiplied by the difficulty rating for the problem. Thus for the spring 2020 questions, the maximum possible score is 19. Note that we use your total score as part of our decision process, but it is not the only factor. We also take into account performance in your courses, Skype interviews, etc. Also note that it is much better to get a few problems 100% correct than to have "a start" on more of the problems.

# Spring 2021

Questions to be completed for the spring 2021 round of interviews are contained in a zip file: https://cs.indstate.edu/info/files/ga_questions_spring2021.zip

# Fall 2020

**Deadline** Submit by 11:59pm eastern US time on October 12.

**Submitting** If you are a current ISU student, submit on the CS server, and run `handin --checkout -c cs458 ga_interview_questions`

to checkout, and follow the instructions in that file. For prospective/incoming students, attach your solutions and reply by email to the associate chairperson of CS. Good luck!

*Note that these questions (check out using the instructions above) are a bit different than we have done in the past. Rather than testing whether you are a good problem solver and programmer on challenging problems, we are testing whether you can follow along and complete assignments for the courses you would be helping students with in the CS lab. Good luck.*

# Spring 2020

**Deadline** Submit by 11:59pm eastern US time on March 29.

**Submitting** If you are a current ISU student, submit on the CS server, and run `handin --checkout -c cs473 ga_interview_questions`

to checkout the contest, when you are done (read the instructions in the README file) you submit with `handin -c cs473 ga_interview_questions`

. For prospective/incoming students, attach your solutions and reply by email to the associate chairperson of CS. Good luck!

The following are the interview questions to be solved for those being interviewed in the spring of 2020.

## Graph Levels

Write a program to read an input digraph G and decide whether its vertices can be partitioned into k parts say V_1,V_2,...,V_k such that every arc of G goes from some V_i to V_{i+1}, 1 <= i <= k.

- Solution filename: graph-levels.* (c, py, java, etc.)
- Input format: adjacency list - number of vertices on first line, number of edges on second line, remaining lines indicate a directed edge u v.
- Partial credit: NA
- Full credit: output either "not possible" or the smallest k such that the graph can be partitioned as described
- Difficulty rating: 2

## Product Sum

Let a1, a2,..., a2020 be positive real numbers such that their product is 2^{-2020}. We sum these numbers to get a value x. What is the minimum value that x could be and what is the maximum value that x could be? Your solution should include an argument/proof for why your answers are correct.

- Solution filename: product-sum.* (txt, docx, pdf, etc.)
- Input format: NA
- Partial credit: correct value and argument for either min or max
- Full credit: correct values and arguments for both min and max
- Difficulty rating: 2

## Exponential

We divide the number (1+2+3+...2020)^{12+22+...+20202} by 7. What is the remainder? Your solution should include an argument/proof for why your answer is correct.

- Solution filename: exponential.* (txt, docx, pdf, etc.)
- Input format: NA
- Partial credit: correct answer with justification but not a formal proof
- Full credit: correct answer and formal proof of correctness
- Difficulty rating: 2

## Covid-19

Given covid-19 deaths data compute for each date the countries with the top 10 highest number of new deaths for that date.

- Solution filename: covid-19.* (c, py, java, etc.)
- Input format: csv file linked above
- Output format: for each date (most recent first) the countries with the top 10 highest # of deaths, including the country name and number of deaths on that date
- Partial credit: (a) csv file in similar format as input but with totals per country rather than state/provice, or (b) for each date, print the top 10 entries on that date (could be a state/province or a country)
- Difficulty rating: 2

## Genes

Given Gene transcripts print out the percentage of base pairs that are A, T, C, G overall and as the first, second, last, and second-last in a transcript. Note that you should ignore any transcripts that are "transcript variant 2", variant 3, etc. Note that your program should complete on our server in just a few seconds, and should complete your personal computer in less than, say, 1 minute.

- Solution filename: genes.* (c, py, java, etc.)
- Input format: text file linked above (needs to be unzipped)
- Output format: percentages listed above
- Partial credit: one or more of the requested percentages is correct
- Difficulty rating: 2

## Indy Weather

Given Indianapolis weather data determine the average temperature for each year and whether it is higher than the average over all years prior to it, and print out the percentage of years that are warmer than the average of previous years.

- Solution filename: indy-weather.* (c, py, java, etc.)
- Input format: csv file linked above
- Output format: percentage of years that have an average temperature higher than the average of previous years, also print the average temperature for each year
- Partial credit: compute and print the average temperature for each year
- Difficulty rating: 2

## Machines

We are given n1 jobs each with processing time 0.a, n2 jobs each with processing time 0.b, and n3 jobs each with processing time 0.c. How many machines do we need to schedule all the jobs such that within one unit of time all the jobs are finished?

- Solution filename: machines.* (c, py, java, etc.)
- Input format: six integers separated by whitespace - n1 n2 n3 a b c
- Output format: how many machines
- Partial credit: NA
- Difficulty rating: 3

## Shortcuts

You are testing a robot which can only walk in one direction (north), and will test the robot in a field that is marked in a grid at each meter. The robot will be placed at one of the grid marks and head north until it reaches the edge of the field. For some pairs of points, there is an underground shortcut (tunnel) between the two points. If the robot steps on either end of the shortcut, the robot goes through the shortcut and continues walking north from the other end of the shortcut.

Given the dimensions of the field and a list of S shortcuts (specified by the (x, y) coordinates of both endpoints of the shortcut), a question is whether there is a starting point for the robot that would result in it walking in an infinite loop. The following are examples of maps that could result in an infinite loop. On map1, if the robot starts in the lower left corner it will take the shortcut through a and exit through the top of the field (not an infinite loop). However, if the robot is placed in the spot in between the a shortcuts, the robot would enter an infinite loop. map2 demonstrates that an infinite loop could exist which involves going through a sequence of shortcuts before repeating (by starting in between the a and b points in the first column).

map1 map2 a..b b..a .... .... a..b a..b .... ....

You will be given the dimensions of the field and locations for endpoints of the shortcuts, but without knowing where each shortcut goes. For example, you might be given the following.

map3 .2.. 1... ...4 .3..

Your task is to determine how many different ways to connect the shortcut points would result in a potential infinite loop. For map3, the only pairing which would result in a possible infinite loop is: (2-3), (1-4).

- Solution filename: shortcuts.* (c, py, java, etc.)
- Input format: integers width, height the dimensions of the field on a single line, integer S the number of shortcut holes on its own line, S lines of coordinates x y for the coordinate of a shortcut hole (with the lower left corner being position 0 0). map3 would be presented as

4 4 4 0 2 1 3 1 0 3 1

- Output format: number of different complete pairings of shortcuts that could result in an infinite loop.
- Partial credit: Given the input as stated and a complete pairing of the shortcut holes, determine if the given pairing could result in an infinite loop.
- Difficulty rating: 4

# Spring 2019

The following are the interview questions to be solved for those being interviewed in the spring of 2019. If not otherwise specified, output should be to standard out.

## Guess the Op

Given three integers, your program determines which operations could have resulted in the third integer from the first two. For example, for the integers 5, 3, 2, the operation could have been subtraction. For the integers 5, 4, 1 the operation could have been either subtraction or xor. Your program should output all operations that could have worked, from the following set of operations: + - * / % ^ & | << >>. Note that by % we mean remainder/mod, and the following are bit operations: ^ & | >> <<.

- Solution filename: guess-op.* (c, py, java, etc.)
- Input format: three integers, read from standard input
- Partial credit: not likely
- Full credit: output all operations that could have resulted in the third integer, or "none" if there aren't any.
- Difficulty rating: 0.5

## Data Saver

For this problem your program should save on data space by determining the most frequent letter in an input text and then output the text with that letter removed.

- Solution filename: data-saver.* (c, py, java, etc.)
- Input format: your program should read its input from standard input, and should continue reading until end of file
- Partial credit: determine the most frequent letter and output it
- Full credit: output should only be - the input text but with the most frequent letter removed. Note that this likely requires saving all of the text until the end of input, and at that point deciding what the most frequent letter was. The program should work for inputs of up to 100 million characters.
- Difficulty rating: 1

## Mod-7 Exponents

Determine the value of the expression (2018^{2019} + 2019^{2018}) mod 7. Use whatever means you would like to determine the correct answer. Once you have the correct answer, write up a proof that the answer is correct. It is possible to compute the answer using a single piece of paper, if you use the right results from mathematics.

- Solution filename: mod7-exponent.* (txt, docx, pdf, etc.)
- Partial credit: Determine the correct answer and explain how you got it (if you used a program, then include the code for the program).
- Full credit: A proof that fits on one page and is easy enough to understand.
- Difficulty rating: 2

## 012 Graph Coloring

Write a program that reads a directed graph G and finds a labeling of its vertices with labels ("colors") 0, 1, 2 in such a way that for every arc xy of G, label(y) = label(x)+1 mod 3. Explain your approach first.

- Solution filename: 012-coloring.* (c, py, java, etc.)
- Input format: your program should read a directed graph as an adjacency matrix or adjacency list.
- Partial credit: not likely
- Full credit: correct solution, output "yes" or "no" for the graph
- Additional files: When submitting your solution, you should also attach text files for a few of the test graphs that you tested.
- Difficulty rating: 2

## 10 Smallest

Your program should output the 10th smallest integer from a given input sequence.

- Solution filename: 10-smallest.* (c, py, java, etc.)
- Input format: sequence of at least 10 integers
- Partial credit: produce the correct answer, and a good explanation for your algorithm
- Full credit: algorithm that is, roughly speaking, as fast as possible
- Difficulty rating: 2

## Diagonal Walks

Your program should determine how many valid diagonal walks are possible between two points on a grid, subject to certain rules. We imagine that you start by standing in a Euclidean plane at the origin, (0, 0). You want to walk to the point (k, 0). We assume k is an integer. Each step you take should be one unit closer to your destination on the x axis. Thus the x coordinate of your location is first 0, then 1, then 2, etc. You can choose your y coordinate to be either +1 or -1 from your current location, but your y coordinate should never be negative. From the origin, you only have one choice - to position (1, 1). From (1, 1) your next move could be to (2, 2) or (2, 0).

You must determine how many diagonal walks follow these rules and take you from (0, 0) to (k, 0).

- Solution filename: diagonal.* (c, py, java, etc.)
- Input format: a single integer k, the distance along the x axis of your walk
- Partial credit: produce correct answer, good explanation for your algorithm
- Full credit: produce all of the valid diagonal walks, and if an appropriate setting is given (via command line or otherwise) all of the walks are printed
- Difficulty rating: 3