Board: NTU-Exam
Course name: Probability
Course type: Department required course
Instructor: 林守德
College: College of Electrical Engineering and Computer Science
Department: Department of Computer Science and Information Engineering
Exam date (Y/M/D): 2008/06/19
Time limit (minutes): 180
Reward requested: Yes (not granted unless explicitly stated)

Exam questions:

1. Let X have the following probability density function:

   $f(x) = \sigma^{-1} (2\pi)^{-0.5} \exp\left( -(x-\mu)^2 / (2\sigma^2) \right)$

   What is the probability density function of Y = exp(X)?

2. Person A throws an unbiased die n times and B throws the same die n+1
   times. We care about how many '6's they throw. If you are told that
   P(B has more '6's than A) = 5/12, what is the probability that A and B
   have equally many '6's after each has thrown the die n times?
   Hint: condition on which player has more '6's after each has thrown
   n times.

3. Your company must make a sealed bid for a construction project. Your
   company wins if its bid is lower than the bids of the other companies.
   If you win the bid, you plan to pay another firm 100 thousand dollars
   to do the work. You are competing with two other companies, and you
   believe their bids are two independent random variables uniformly
   distributed on [70,250] and [140,300], respectively.
   (a) Suppose your bid is x. What is the probability that you win?
   (b) Suppose your bid is x. What is the expected profit?
   (c) Determine the x that maximizes your expected profit.

4. X, Y, and Z are three random variables. Can you propose a real-world
   example in which they satisfy the following conditions?
   (a) X and Y are independent.
   (b) X and Y become dependent given Z.

5. T1 and T2 are two positive continuous random variables that satisfy:
   ˙T1 > T2
   ˙T1 + T2 < 2
   Their joint density function is uniform in the above region and zero
   elsewhere. What is P(T1 + T2 > 1)?

6. Let X and Y be random variables of the continuous type having the
   joint p.d.f.:
   f(x,y) = 2, 0 <= y <= x <= 1.
   (a) What are the means of X and Y?
   (b) What is the covariance of X and Y?

7. A public poll was taken to determine whether we should allow tourists
   from Mainland China. Let p equal the proportion of people who favor
   this decision. We shall test H0: p = 0.65 against H1: p > 0.65.
   (a) Given α = 0.025, what is the critical region?
   (b) Given that 414 out of a sample of 600 favor this proposal, find
       the p-value.
   (c) Should we reject or accept H0?

8. The teacher claims that 1/4 of the students will receive an A grade,
   1/4 will receive a B, and 1/2 will receive a C. Among the 40 students,
   6 receive A, 7 receive B, and 27 receive C. Would the claim be
   rejected at the α = 0.05 significance level?

9. A six-sided fair die is rolled. What is the mutual information between
   the top side and the front side (the side most facing you)?
   Hint: The sum of two opposite sides is always 7.

10. Half of the Taiwanese students in the class get a high score, and 2/3
    of the students in the class are Taiwanese. Only 1/10 of the
    non-Taiwanese students get a high score.
    (a) Define the random variables and draw the Bayesian network (with
        conditional probability tables) for this statement.
    (b) What is the probability that a randomly chosen student is a
        Taiwanese who gets a high score?
    (c) Given an association rule that says
        "Japanese = true" → "score = high"
        please provide a pair of "reasonable" min-support and
        min-confidence values that make this rule true.

11. Given the following Bayesian network,

              ┌─┐
              │H│
              └─┘
             ↙   ↘
          ┌─┐     ┌─┐
          │S│     │F│
          └─┘     └─┘
           ↓
          ┌─┐
          │W│
          └─┘

    Edges: H → S, H → F, S → W.

    P(H) = 1/2
    P(S|H) = 1/10    P(S|~H) = 1/2
    P(F|H) = 0       P(F|~H) = 1/2
    P(W|S) = 1/2     P(W|~S) = 1

    please calculate P(W,F).

12. Corpus C consists of only three documents:
    D1: "new york times"
    D2: "new york post"
    D3: "los angeles times"
    (a) Please use the vector-space model to represent these three
        documents, assuming the weights are all binary and the words in
        the vector are ordered alphabetically.
    (b) Please use the vector-space model to represent these three
        documents, assuming the weights are TFIDF values and assuming
        that term frequencies are normalized by the maximum frequency in
        a document.
        Note: Please use the base-10 logarithm with the following table:

          n │   2      3      4      5      6      7      8      9
        ────┼────────────────────────────────────────────────────────
        log │ 0.3010 0.4771 0.6021 0.6990 0.7782 0.8451 0.9031 0.9542

    (c) Given the query "new new times", calculate the corresponding
        TFIDF-based vector, and compute its similarity to D1 using the
        cosine similarity measure. Assume that term frequencies are
        normalized by the maximum frequency in the given query.

13. Given a social network with the following directed, labeled edges:

        Sue  --calls-->  N1
        N1   --calls-->  Jean
        Jean --emails--> Sue
        Sue  --emails--> P1
        Jean --calls-->  P1
        Jean --calls-->  C1

    There are six paths of length 2:

        Sue  --calls-->  N1   --calls-->  Jean
        Jean --emails--> Sue  --emails--> P1
        Jean --emails--> Sue  --calls-->  N1
        N1   --calls-->  Jean --calls-->  C1
        N1   --calls-->  Jean --calls-->  P1
        N1   --calls-->  Jean --emails--> Sue

    Suppose we perform a random experiment that picks a length-2 path at
    random, and define two random variables S and P:
    S: the starting node of the path (e.g. "Sue")
    P: the link combination of the path (e.g. {calls, emails},
       {calls, calls})
    (a) What is the size of P's outcome space?
    (b) What is the mutual information I(S;P)?
    (c) Assume that min-support is 0.3 and min-confidence is 0.7. Can we
        conclude the association rule N1 → {calls, calls}? Why?
    (d) Assume the initial PageRank value of each node is 0.2. Which
        node(s) have the highest PageRank value after two iterations?

14. The problem of "Chinese poetry segmentation" aims at breaking a line
    of Chinese poetry into a sequence of terms, for example,
    "夜半鐘聲到客船" → "夜半 鐘聲 到 客船"
    Can you carefully describe a way to use an n-gram LM to do this job?
    Hint: You need to determine not only where to put the breaks but also
    how many breaks there are.

--
※ Posted from: PTT (批踢踢實業坊, ptt.cc)
◆ From: 61.229.232.184
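
For anyone who wants to check their hand computation for Problem 12, here is a minimal Python sketch of the vector-space model it describes. It is not the official solution. The exam does not spell out the IDF formula, so the sketch assumes idf(t) = log10(N / df(t)), and it assumes the query uses the IDF values of corpus C. The function names (binary_vector, idf, tfidf_vector, cosine) are made up for illustration.

```python
import math

# Corpus C from Problem 12.
docs = {
    "D1": "new york times".split(),
    "D2": "new york post".split(),
    "D3": "los angeles times".split(),
}

# Alphabetically ordered vocabulary:
# ['angeles', 'los', 'new', 'post', 'times', 'york']
vocab = sorted({w for words in docs.values() for w in words})


def binary_vector(words):
    """1 if the term occurs in the document, else 0."""
    return [1 if term in words else 0 for term in vocab]


def idf(term):
    """ASSUMPTION: idf = log10(N / df), with df counted over corpus C."""
    df = sum(term in words for words in docs.values())
    return math.log10(len(docs) / df)


def tfidf_vector(words):
    """TF normalized by the maximum term frequency in `words`, times IDF."""
    max_tf = max(words.count(w) for w in words)
    return [words.count(term) / max_tf * idf(term) for term in vocab]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


binary = {name: binary_vector(words) for name, words in docs.items()}
tfidf = {name: tfidf_vector(words) for name, words in docs.items()}

# Part (c): the query "new new times" against D1.
query_vec = tfidf_vector("new new times".split())
similarity = cosine(query_vec, tfidf["D1"])

if __name__ == "__main__":
    for name in docs:
        print(name, binary[name], [round(x, 4) for x in tfidf[name]])
    print("query", [round(x, 4) for x in query_vec])
    print("cos(query, D1) =", round(similarity, 4))
```

Under these assumptions, terms that appear in two of the three documents (new, times, york) get weight log10(3/2), and terms that appear in only one get log10(3) = 0.4771 from the table. A different IDF convention, such as a smoothed one, would change the weights.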