Course name: Algorithms for Biological Sequence Analysis (生物序列分析演算法)
Course type: Algorithms
Instructor: 趙坤茂 (Kun-Mao Chao)
College: Electrical Engineering and Computer Science
Department: Computer Science and Information Engineering
Exam date (Y/M/D): 2010.11.04
Time limit: 3 hours
Reward requested: yes

Exam questions:

Problem 1 (15%): Suppose we are given a very long DNA sequence where the occurrence probabilities of nucleotides A (adenine), C (cytosine), G (guanine), and T (thymine) are 0.1, 0.3, 0.4, and 0.2, respectively.
(a) (10%): Construct a Huffman code for them. You should work out the binary tree construction as well as the code assignment.
(b) (5%): Under the above Huffman coding scheme, what is the binary string for the 10-nucleotide DNA sequence "GGGCTTCACG"?

Problem 2 (15%): In class, we introduced an $O(n \log n)$-time algorithm for finding a longest increasing subsequence. Use $\langle 8, 2, 6, 4, 5, 7, 3, 1, 12, 9, 10 \rangle$ to explain how the algorithm works.

Problem 3 (10%): Given a sequence of real numbers $A = \langle a_1, a_2, \ldots, a_n \rangle$, the maximum-sum segment problem is to find a consecutive subsequence, i.e., a substring or segment, in $A$ with the maximum sum. Let the prefix sum $P[i] = \sum_{j=1}^{i} a_j$ be the sum of the first $i$ elements. Explain how to use the prefix sums to deliver the maximum-sum segment in $O(n)$ time.

In the following, we are given two sequences $A = \langle a_1, a_2, \ldots, a_m \rangle$ and $B = \langle b_1, b_2, \ldots, b_n \rangle$. An alignment of $A$ and $B$ is obtained by introducing dashes into the two sequences such that the lengths of the two resulting sequences are identical and no column contains two dashes. Let $\Sigma$ denote the input symbol alphabet. A score $\sigma(a, b)$ is defined for each $(a, b) \in \Sigma \times \Sigma$. The score of an alignment is the sum of the $\sigma$ scores of all columns with no dashes minus the penalties of the gaps.

Problem 4 (25%): In this problem, we employ a simple scoring scheme where each gap symbol is penalized by a nonnegative constant $\beta$. Let $S[i, j]$ denote the score of an optimal alignment between $\langle a_1, a_2, \ldots, a_i \rangle$ and $\langle b_1, b_2, \ldots, b_j \rangle$. With proper initializations, $S[i, j]$ can be computed by the following recurrence:

$$S[i, j] = \max \begin{cases} S[i-1, j] - \beta \\ S[i, j-1] - \beta \\ S[i-1, j-1] + \sigma(a_i, b_j) \end{cases}$$

(a) (15%): Write down complete pseudo-code for computing $S[m, n]$ in $O(mn)$ time and $O(m+n)$ space. All initializations should be included in the pseudo-code.
(b) (10%): Assume that we allow at most three gaps in an alignment. Give a method (as efficient as possible) for computing the score of an optimal alignment.

Problem 5 (20%): With affine gap penalties, a gap of length $k$ is penalized by $\alpha + k \times \beta$, where $\alpha$ and $\beta$ are both nonnegative constants.
(a) (10%): Give the recurrence relations for computing the score of an optimal (global) alignment between $A$ and $B$. Justify your recurrence relations and include all initializations.
(b) (10%): Give the recurrence relations for computing the score of an optimal local alignment between $A$ and $B$. Explain your recurrence relations and include all initializations.

Problem 6 (15%): Consider the problem of computing all $\Delta$-points of two sequences of lengths $m$ and $n$, where $m \ll n$. Describe a method for computing all $\Delta$-points that works in $O(mn)$ time and $O(m^{11/10} + n)$ working space.

--
Nothing is Impossible
--
※ Origin: 批踢踢實業坊 (ptt.cc) ◆ From: 140.112.30.46
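
Note on Problem 4: the recurrence for S[i, j] reads only row i-1 and the part of row i already filled in, so the score can be computed while keeping just two rows. The sketch below is one possible illustration in Python, not the official solution. It assumes the usual global-alignment initialization (S[0][0] = 0, S[i][0] = -i·β, S[0][j] = -j·β), which the exam leaves to the reader, and the names `alignment_score` and `simple_sigma` as well as the +1/-1 scores are hypothetical.

```python
from typing import Callable, Sequence


def alignment_score(a: Sequence[str], b: Sequence[str],
                    sigma: Callable[[str, str], float], beta: float) -> float:
    """Return S[m][n] for the simple gap scheme, keeping only two rows."""
    m, n = len(a), len(b)
    # Row 0 (assumed initialization): aligning the empty prefix of A with
    # b_1..b_j uses j gap symbols, so S[0][j] = -j * beta.
    prev = [-j * beta for j in range(n + 1)]
    for i in range(1, m + 1):
        # Column 0 (assumed initialization): S[i][0] = -i * beta.
        curr = [-i * beta] + [0.0] * n
        for j in range(1, n + 1):
            curr[j] = max(
                prev[j] - beta,                           # a_i against a dash
                curr[j - 1] - beta,                       # b_j against a dash
                prev[j - 1] + sigma(a[i - 1], b[j - 1]),  # a_i against b_j
            )
        prev = curr  # row i becomes row i-1 for the next iteration
    return prev[n]


def simple_sigma(x: str, y: str) -> float:
    # Hypothetical scores, not from the exam: +1 for a match, -1 otherwise.
    return 1.0 if x == y else -1.0


if __name__ == "__main__":
    print(alignment_score("ACGT", "AGT", simple_sigma, 1.0))
```

The two rows hold n + 1 entries each, so the working space is O(n), which is within the O(m+n) bound, and the double loop takes O(mn) time. Swapping the roles of A and B when n > m would reduce the space to O(min(m, n)).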
andy74139 :Archived in the CS department digest!! 05/15 20:26
andy74139 :Is the exam date perhaps wrong?? 05/15 20:27
※ Edited: wanquan, from: 140.112.30.46 (05/15 23:14)
wanquan :Corrected.. thanks 05/15 23:14