[試題] 102上 蔡欣穆 演算法設計與分析 期中考

作者andyyuan ()

看板NTU-Exam

標題[試題] 102上蔡欣穆演算法設計與分析期中考

時間Sun Jan 12 20:59:36 2014

課程名稱︰演算法設計與分析課程性質︰資工系大二必修課程教師︰蔡欣穆開課學院：電機資訊學院開課系所︰資訊工程學系考試日期（年月日）︰2013/11/7 考試時限（分鐘）：180 minutes 是否需發放獎勵金：是（如未明確表示，則不予發放）試題 : 132 points in total Problem 1. In each of the following question, please specify if the statement is true or false. If the statement is true, explain why it is true. If it is false, explain what the correct answer is and why. (16 points. For each question, 1 point for the true/false answer and 2 points for the explanations) 1. The bug report is usually closed by the person who fixes the bug. 2. The class of NP is the class of languages that cannot be accepted in polynomial time. 3. Different encodings would cause different time complexity for the same algorithm. 4. " A language L is accepted by an algorithm A " means that, for some x !∈ L, A would reject x. Problem 2. " Short answer " questions : (21 points, 3 points for each question except question 2.6) 1. Why is it necessary for the software tester to minimize the number of steps to generate a bug ? 2. What are the 3 things that need to be in a bug report in a bug tracking system ? 3. Explain what " polynomial larger " means/ 4. Give an example of recurrences that is in the form of T(n) = aT(n/b) + f(n), but cannot be solved with Master theorem. 5. Explain what would happen if a dynamic programming algorithm is designed to solve a problem that does not have overlapping sub-problems. 6. Derive a ternary Huffman code for the following frequencies of characters : {a:100, b:500, c:500, d:600, e:800, f:1000, g:1700, h:1800}. Ternary codes use ternary bits, i.e. {0,1,2}, instead of binary bits, in its codewords. Draw the decoding tree for the code you derive. (6 points) Problem 3. Making changes (8 points) Assume that there are n kinds of available coins. The denominations are 1 = a_1 < a_2 < ... < a_n dollars, respectively, and each a_i, 1 ≦ i ≦ n is an integer. Consider you are making changes for m dollars using the fewest number of coins. 1. Describe a greedy algorithm to make change when n = 4 and the denominations are 1, 5, 10, 50. Prove that your algorithm always tields an optimal solution (prove the greedy property). (4 points) 2. Give a set of denominations for which the greedy algorithm does not yield an optimal solution to some m. Give an example of m to explain. (4 points) Problem 4. Copying Books (26 points) Before the invention of book-printing, it was very hard to make a copy of a book. All the contents had to be re-written by hand by so called scribers. The scribers had been given a book and after several months he finished its copy. One of the most famous scribers lived in the 15th century and his name was Xaverius Endricus Remius Ontius Xendruanus (Xerox). Anyway, the work was very annoying and boring. And the only way to speed it up was to hire more scribers. Once upon a time, there was a theater ensemble that wanted to play famous Antique Tragedies. The scripts of these plays were divided into many books and actors needed more copies of them, of course. So they hired many scribers to make copies of these books. Imagine you have m books (numbered 1,2,...,m) that may have different numbers of pages (p_1,p_2,...p_m) and you want to make one copy of each of them. Your task is to divide these books among k scribes, k ≦ m. Each book can be assigned to a single scriber only, and every scriber must get a continuous sequence of books. That means, there exists an increasing succession of numbers 0 = b_0 ≦ b_1 ≦ b_2 ≦ ... ≦ b_(k-1) ≦ b_k = m such that i-th scriber gets a sequence of books with number between b_(i+1) and b_i. The time needed to make a copy of all the books is determined by the scriber who was assigned the most work. Therefore, our goal is to minimize the maximum number of pages assigned to a single scriber. Your task is to find the optimal assignment. Answer the following related questions : 1. Define the sub-problem so that we can solve the problem with dynamic programming. Please clearly mark the parameters of the sub-problems and what each parameter represents. (3 points) 2. In this problem, the cost function that we need to optimize is the maximum number of pages assigned to a single scriber. Using your sub-problem definition, write down the recurrences that determine the value of the cost function based on the ones of smaller related sub-problem. Please also write down the boundary condition(s) of the cost function. (5 points) 3. Using a bottom-up style, please write the pseudo code or the C code of your algorithm to solve the problem in O(mk)-time. (5 points) 4. Please analyze the time complexity of your algorithm and show that it is indeed O(mk). (3 points) 5. Please prove this problem exhibits optimal substructure. Note that the proof will be in a slightly different format than the ones that we talked about in the lectures. The proof should state that optimal solution(s) to sub-problem(s) can be used to put together the optimal solution to the problem (the big problem). You should start by assuming that you can find the optimal solution(s) to the sub-problem(s), and use it(them) to lead to a solution to the big problem (we usually do it a opposite way). Then, assume that you can find a better solution than this solution to the big problem, and try to find a contradiction which proves that this assumption is false. Then the proof is completed. (10 points) Problem 5. Revisiting the Problem of Finding the Closest Pair of Points in 2D Space (15 points) In the third Devide-and-Conquer lecture, we talked about how to solve the problem with O(nlogn)-time. Assume the original set of points is set P. The divide step in the original algorithm presented in the lecture splits P into P_L and P_R and is as the following : a. Sort the points with their X coordinates. Also sort the points with Y coordinates. The result will be two sorted sequences of points P. b. The first \lceil |P|/2 \rceil -th point in the X-sorted sequence be x_c. c. The first \lceil |P|/2 \rceil points in the X-sorted sequence of points in P will be in P_L. And P_R = P - P_L. We would now like to change c. and d. of the divide step to become the following (called the new algorithm in the following) : c. P_L = {for every p ∈ P, x(p) ≦ x_c}, and P_R = P - P_L. x(p) is the X coordinate of point p. d. The line L that separates P_L and P_R is still x = x_c. In this case, only points in P_L are possible to be on line L. The conquer step is the same for the original algorithm and the new one with modified divide step : a. Respectively split the X-sorted and Y-sorted sequences into two sub-sequences according to the membership of points in P_L and P_R. b. Obtain the closest pairs of points in P_L and P_R, respectively, with recursive function calls. The split sorted sub-sequences are given to the function calls solving the sub-problem. Assume the distances between the closest pairs of points in P_L and P_R are δ_L and δ_R, respectively, and δ = min(δ_L,δ_R). The following questions are related to the combine steps of both the original algorithm and the new algorithm : 1. In order to keep achieve an O(nlogn)-time complexity, what is the maximum time complexity that the combine step can have ? Please use the recurrence to explain. (3 points) 2. In the combine step of the original algorithm, we remove the points in {for every p ∈ P, |x(p) - x_c| > δ} in the Y-sorted sequence. Then, we only pair each point to the next N_0 points in the sequence and check if the distance between the pair of tpoints is smaller than δ. What is the value of N_0 ? (1 points) Explain why it is not necessary to check the point after the next N_0 points in the sequence. (3 points) 3. With the new algorithm (with the mdified divide step) we can follow the same rationale, but now we can instead pair each point to the next N_m points. What is the value of N_m ? (1 points) Similarly, please explain the reason in detail. (3 points) 4. The two algorithm have the same time complexity - O(nlogn). In your opinion, which algorithm would run faster in reality ? To explain, please list the differences between the two algorithms in the divide step and in the combine step and compare their running time. (4 points) Problem 6. Solving the recurrences (20 points) Please use the recurrence tree method to solve the recurrence T(n) = √n T(√n) + n. The solution needs to be in the Θ-notation, and thus you need to do the following : 1. Draw the recurrence tree and use it to estimate the solution. The solution is in the form of T(n) = Θ(f(n)) (8 points) 2. Use the substitution method to prove the upper bound. In other words, prove that T(n) = O(f(n)). (6 points) 3. Use the substitution method to prove the lower bound. In other words, prove that T(n) = Ω(f(n)). (6 points) Problem 7. Maximum Sub-array, the Dynamic Programming Way (20 points) In the Divide-and-Conquer lectures, we talked about how to derive an O(nlogn)-time algorithm to solve the problem. As we stated in the lecture, this is actually not the fastest algorithm to solve the problem. Dynamic programming actually can be used to solve the problem in linear time. Let's re-state the problem as follows. You are given a sequence of n numbers {a_1, a_2, ..., a_n} in an array. You are asked to find the sum of the maximum sub-array. A sub-array is given by {a_i, a_(i+1), ..., a_j}, 1 ≦ i,j ≦ n. Note that 0, the sum of an empty sub-array, i.e., j < i, is allowed to be given as the answer. For example, the case could happen when all values in the array are negative. To give you easier time, we will define the sub-problem for you, given as the following. The sub-problem S_i, 1 ≦ i ≦ n, is to defined as the one to find the maximum sub-array that starts from anywhere but ends on a_i. 1. Prove that, with the given definition, the problem exhibits optimal substructure. (4 points) 2. Use the recurrences to give the cost function that represents the sum of the maximum sub-array of the given sub-problem. Note that in different conditions, the function is given in different ways. Ans you also need to consider and give the boundary cases. (6 points) 3. Assume that you are given an array as the following : {1,-4,3,2,-1,3,5,-4}. As an example, fill out the cost function table to slove all sub-problems for the given array. Then, explain and show how you can use the results to obtain the solution of the original problem (maximum sub-array of the original array). (4 points) 4. Instead of outputting the maximum value of the sum of the sub-array, you are now asked to give the indices of the starting and ending elements of the maximum sub-array. Please write down the pseudo code or C code to output these two numbers. (6 points) Problem 8. I have the tradition of letting the students write down some feedbacks about the course in the exam and I would like to continue this tradition. Please write down 3 things you like about this course and 3 things that you would like to see some changes (and your suggestion about how we should change them). (6 points) -- ※ 發信站: 批踢踢實業坊(ptt.cc) ◆ From: 140.112.243.26