Course name: Natural Language Processing
Course type: Department elective
Instructor: 陳信希
College: College of Electrical Engineering and Computer Science
Department: Computer Science and Information Engineering
Exam date (Y/M/D): 2013/1/10
Time limit (minutes): 180
Reward requested: Yes, thank you
(No reward is issued unless explicitly requested.)

Exam questions:

1. In part-of-speech tagging, we can employ the VITERBI algorithm shown below
   to find the best tag sequence for a given word sequence. The algorithm
   consists of an initialization step, a recursion step, and a termination
   step. Please fill in the missing part of the algorithm. (10 points)

   function VITERBI(observations of len T, state-graph of len N) return best-path
     create a path probability matrix viterbi[N+2,T]
     for each state s from 1 to N do                 ; initialization step
       viterbi[s,1] <-- a_{0,s} * b_s(o_1)
       backpointer[s,1] <-- 0
     for each time step t from 2 to T do             ; recursion step
       for each state s from 1 to N do
         ┌────────────────────┐
         │                    │
         │    missing part    │
         │                    │
         └────────────────────┘
     viterbi[qF,T] <-- max_{s=1..N} viterbi[s,T] * a_{s,qF}         ; termination step
     backpointer[qF,T] <-- argmax_{s=1..N} viterbi[s,T] * a_{s,qF}  ; termination step
     return the backtrace path by following backpointers to states
       back in time from backpointer[qF,T]

2. Please define a probabilistic context-free grammar (PCFG) formally, and
   explain how a PCFG deals with the overgeneration problem. (10 points)

3. In syntactic parsing, how to employ the grammar rules in a systematic way
   and how to avoid duplicated work are two important issues. Please show
   solutions that deal with these problems. (10 points)

4. In statistical parsing, the outside probability and the inside probability
   are defined as follows, respectively.

     α_j(p,q) = P(w_{1(p-1)}, N^j_{pq}, w_{(q+1)m} | G)
     β_j(p,q) = P(w_{pq} | N^j_{pq}, G)

   We can use either inside probabilities or outside probabilities to compute
   the probability of an input sentence given grammar G. Please show the
   bottom-up dynamic programming algorithm to get the sentence probability.
   Which probability is used in this algorithm? In what situations do we use
   both probabilities? (15 points)

5. Most natural language processing adopts notation transformation.
   For example, a natural language sentence is transformed into a semantic
   representation that is unambiguous, has a truth value, and supports
   inference. You are asked to design a system that maps a sentence into an
   SQL representation. In this way, the SQL query processor can be regarded as
   a semantic interpreter. Please show your idea for deriving the semantic
   form. Any solution is welcome. (10 points)

6. (a) What is word sense disambiguation? (5 points)
   (b) Please explain how semantic selection restrictions can be used to deal
       with the word sense disambiguation problem. (5 points)
   (c) Word sense disambiguation can also be formulated as a classification
       problem. Please explain how a Naive Bayes classifier works for this
       problem. (5 points)

7. Discourse segmentation aims at segmenting a document into a linear
   sequence of subtopics. Please propose a discourse segmentation method and
   discuss why discourse segmentation is useful. (15 points)

8. Ambiguity is an inherent characteristic of human languages. From the
   analysis point of view, please give an example of ambiguity at the
   morphological level, lexical level, syntactic level, semantic level, and
   pragmatic level, respectively. (15 points)

9. In the term project, a sentiment dictionary seems to be useful. However,
   the coverage of the dictionary is a problem. Do you have any idea for
   extending the coverage of this dictionary? (Bonus: 10 points)

--
※ Posted from: 批踢踢實業坊 (ptt.cc)
◆ From: 123.193.6.232
※ Edited by: rod13824, from: 140.112.30.42 (01/10 22:47)
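
For readers reviewing question 1, here is a minimal Python sketch of one common textbook formulation of Viterbi decoding for an HMM tagger. It is not an official answer key for the exam. The function name `viterbi`, the dictionary-based parameters (`start_p`, `trans_p`, `emit_p`, `end_p`), and the toy probabilities are all assumptions made for illustration. The three commented steps mirror the initialization, recursion, and termination steps named in the question.

```python
def viterbi(observations, states, start_p, trans_p, emit_p, end_p):
    """Return (best_path, best_prob) for a sequence of observations.

    start_p[s]     : probability of moving from the start state to s
    trans_p[r][s]  : probability of moving from state r to state s
    emit_p[s][o]   : probability of state s emitting observation o
    end_p[s]       : probability of moving from s to the final state
    (All parameter names are hypothetical, chosen for this sketch.)
    """
    T = len(observations)
    v = [{} for _ in range(T)]      # v[t][s]: best path probability ending in s at time t
    back = [{} for _ in range(T)]   # back[t][s]: previous state on that best path

    # Initialization step
    for s in states:
        v[0][s] = start_p[s] * emit_p[s][observations[0]]
        back[0][s] = None

    # Recursion step: extend the best path into each state s at time t
    for t in range(1, T):
        for s in states:
            best_prev = max(states, key=lambda r: v[t - 1][r] * trans_p[r][s])
            v[t][s] = (v[t - 1][best_prev] * trans_p[best_prev][s]
                       * emit_p[s][observations[t]])
            back[t][s] = best_prev

    # Termination step: pick the best last state, including the end transition
    last = max(states, key=lambda s: v[T - 1][s] * end_p[s])
    best_prob = v[T - 1][last] * end_p[last]

    # Follow the backpointers from the last state to the first
    path = [last]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best_prob


# Toy two-tag example with made-up probabilities (N = noun, V = verb).
states = ["N", "V"]
start_p = {"N": 0.8, "V": 0.2}
trans_p = {"N": {"N": 0.2, "V": 0.6}, "V": {"N": 0.4, "V": 0.2}}
end_p = {"N": 0.2, "V": 0.4}
emit_p = {"N": {"fish": 0.7, "swim": 0.3}, "V": {"fish": 0.2, "swim": 0.8}}

path, prob = viterbi(["fish", "swim"], states, start_p, trans_p, emit_p, end_p)
print(path, prob)
```

The sketch stores probabilities directly. For long sentences, a practical tagger would usually work with log probabilities to avoid numerical underflow.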