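Course name: Natural Language Processing
Course type: Department elective
Instructor: 陳信希
College: College of Electrical Engineering and Computer Science
Department: Computer Science and Information Engineering
Exam date (Y/M/D): 2013/1/10
Time limit (minutes): 180
Reward requested: Yes, thank you
(If not explicitly stated, no reward will be issued)
Exam questions: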
1. In part-of-speech tagging, we can employ the Viterbi algorithm, shown below,
to find the best tag sequence for a given word sequence. The algorithm is
composed of an initialization step, a recursion step, and a termination step.
Please fill in the missing part of the algorithm. (10 points)
function VITERBI(observations of len T, state-graph of len N)
    returns best-path
  create a path probability matrix viterbi[N+2, T]
  for each state s from 1 to N do                   ; initialization step
    viterbi[s, 1] <-- a(0,s) * b_s(o_1)
    backpointer[s, 1] <-- 0
  for each time step t from 2 to T do               ; recursion step
    for each state s from 1 to N do
      ┌────────────────────┐
      │                    │
      │    missing part    │
      │                    │
      └────────────────────┘
  viterbi[qF, T] <-- max_{s=1..N} viterbi[s, T] * a(s,qF)           ; termination step
  backpointer[qF, T] <-- argmax_{s=1..N} viterbi[s, T] * a(s,qF)    ; termination step
  return the backtrace path by following backpointers to states back in time
    from backpointer[qF, T]
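
For reference, the three steps named in question 1 can be sketched in Python as follows. This is only one possible rendering, not the official answer. The function name `viterbi`, the dictionary layout, and the toy two-tag model (`states`, `start_p`, `trans_p`, `emit_p`, `end_p`) are all assumptions made for illustration; `start_p[s]` plays the role of a(0,s), `emit_p[s][o]` of b_s(o), and `end_p[s]` of a(s,qF).

```python
def viterbi(observations, states, start_p, trans_p, emit_p, end_p):
    """Return (best_path, best_prob) for a toy HMM given as dictionaries."""
    T = len(observations)
    v = [{} for _ in range(T)]      # v[t][s]: best path probability ending in s at time t
    back = [{} for _ in range(T)]   # back[t][s]: best predecessor of s at time t

    # Initialization step
    for s in states:
        v[0][s] = start_p[s] * emit_p[s].get(observations[0], 0.0)
        back[0][s] = None

    # Recursion step: extend the best path into each state s at each time t
    for t in range(1, T):
        for s in states:
            best_prev = max(states, key=lambda r: v[t - 1][r] * trans_p[r][s])
            v[t][s] = (v[t - 1][best_prev] * trans_p[best_prev][s]
                       * emit_p[s].get(observations[t], 0.0))
            back[t][s] = best_prev

    # Termination step: transition into the final state qF
    last = max(states, key=lambda s: v[T - 1][s] * end_p[s])
    best_prob = v[T - 1][last] * end_p[last]

    # Follow the backpointers from the last state to the first
    path = [last]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best_prob


# Toy model (hypothetical numbers, chosen only so the example runs)
states = ["N", "V"]
start_p = {"N": 0.8, "V": 0.2}
trans_p = {"N": {"N": 0.2, "V": 0.6}, "V": {"N": 0.5, "V": 0.3}}
end_p = {"N": 0.2, "V": 0.2}
emit_p = {"N": {"fish": 0.6, "sleep": 0.4}, "V": {"fish": 0.3, "sleep": 0.7}}

path, prob = viterbi(["fish", "sleep"], states, start_p, trans_p, emit_p, end_p)
print(path, prob)
```

With this toy model the best tag sequence for "fish sleep" is N followed by V.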
2. Please define a probabilistic context free grammar (PCFG) formally, and
explain how a PCFG deals with the overgeneration problem. (10 points)
3. In syntactic parsing, how to apply the grammar rules in a systematic way
and how to avoid duplicated work are two important issues. Please present
solutions that deal with these problems. (10 points)
4. In statistical parsing, outside probability and inside probability are
defined as follows, respectively.
   \alpha_j(p,q) = P(w_{1(p-1)}, N^j_{pq}, w_{(q+1)m} \mid G)
   \beta_j(p,q) = P(w_{pq} \mid N^j_{pq}, G)
We can use either inside probabilities or outside probabilities to compute
the probability of an input sentence given grammar G. Please show the
bottom-up dynamic programming algorithm to get the sentence probability.
Which probability is used in this algorithm? In what situations do we
use both probabilities? (15 points)
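
As a reference for question 4, here is a minimal sketch of a bottom-up dynamic programming pass that fills a chart of inside probabilities, assuming the grammar is in Chomsky normal form. The function name `inside_probability`, the rule dictionaries `lexical` and `binary`, and the toy grammar are assumptions made for illustration, not part of the exam. Spans are 0-based here, so the sentence probability is the inside probability of the start symbol over the span (0, m-1).

```python
from collections import defaultdict


def inside_probability(words, lexical, binary, start="S"):
    """Sum the probabilities of all parses of `words` under a CNF PCFG.

    lexical: {(A, word): P(A -> word)}
    binary:  {(A, B, C): P(A -> B C)}
    """
    m = len(words)
    # beta[(p, q)][A] is the inside probability of A over words[p..q]
    beta = defaultdict(lambda: defaultdict(float))

    # Base case: single-word spans use the lexical rules
    for k, w in enumerate(words):
        for (A, word), prob in lexical.items():
            if word == w:
                beta[(k, k)][A] += prob

    # Induction: build longer spans from two shorter adjacent spans
    for span in range(2, m + 1):
        for p in range(0, m - span + 1):
            q = p + span - 1
            for d in range(p, q):  # split point: left is p..d, right is d+1..q
                for (A, B, C), prob in binary.items():
                    left = beta[(p, d)].get(B, 0.0)
                    right = beta[(d + 1, q)].get(C, 0.0)
                    if left and right:
                        beta[(p, q)][A] += prob * left * right

    return beta[(0, m - 1)].get(start, 0.0)


# Toy grammar (hypothetical numbers, for illustration only)
binary = {
    ("S", "NP", "VP"): 1.0,
    ("PP", "P", "NP"): 1.0,
    ("VP", "V", "NP"): 0.7,
    ("VP", "VP", "PP"): 0.3,
    ("NP", "NP", "PP"): 0.4,
}
lexical = {
    ("P", "with"): 1.0,
    ("V", "saw"): 1.0,
    ("NP", "astronomers"): 0.1,
    ("NP", "ears"): 0.18,
    ("NP", "saw"): 0.04,
    ("NP", "stars"): 0.18,
    ("NP", "telescopes"): 0.1,
}

sentence_prob = inside_probability(
    "astronomers saw stars with ears".split(), lexical, binary
)
print(sentence_prob)
```

The example sentence has two parses under this toy grammar (the PP attaches to either the noun phrase or the verb phrase), and the chart sums both.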
5. Most natural language processing involves notation transformation. For
example, a natural language sentence is transformed into a semantic
representation that is unambiguous, has a truth value, and supports
inference. You are asked to design a system that maps a sentence into an SQL
representation. In this way, the SQL query processor can be regarded as a
semantic interpreter. Please describe how you would derive the semantic form.
Any solution is welcome. (10 points)
6. (a) What is word sense disambiguation? (5 points)
(b) Please explain how semantic selectional restrictions can be used to deal
with the word sense disambiguation problem. (5 points)
(c) Word sense disambiguation can also be formulated as a classification
problem. Please explain how a Naive Bayes classifier works in this
problem. (5 points)
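
To illustrate the classification view in 6(c), below is a minimal Naive Bayes sketch that picks the sense maximizing log P(sense) plus the sum of log P(word | sense) over the context words, with add-one smoothing. The function names `train_nb` and `classify_nb`, the sense labels, and the tiny training set for the word "bank" are all assumptions made up for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(examples):
    """examples: list of (sense, context_words) pairs for one ambiguous word."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, context in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(context)
        vocab.update(context)
    return sense_counts, word_counts, vocab


def classify_nb(context, model):
    """Return the sense with the highest smoothed log posterior score."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, count in sense_counts.items():
        score = math.log(count / total)  # log prior of the sense
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context:
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical sense-tagged contexts of "bank"
examples = [
    ("finance", ["deposit", "money", "account"]),
    ("finance", ["loan", "money", "interest"]),
    ("river", ["water", "fishing", "shore"]),
    ("river", ["water", "muddy", "shore"]),
]
model = train_nb(examples)
print(classify_nb(["money", "loan"], model))
print(classify_nb(["water", "shore"], model))
```

The features here are just bag-of-words context; collocations or part-of-speech features could be added in the same way.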
7. Discourse segmentation aims at segmenting a document into a linear sequence
of subtopics. Please propose a discourse segmentation method and discuss
why discourse segmentation is useful. (15 points)
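
One family of methods for question 7 relies on lexical cohesion: adjacent stretches of text on the same subtopic tend to share vocabulary. The sketch below is a simplified illustration in that spirit, not a full published algorithm. The function names `cosine` and `segment`, the `window` and `threshold` parameters, and the example sentences are assumptions chosen for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def segment(sentences, window=2, threshold=0.1):
    """Return the gap indices where a subtopic boundary is proposed.

    Gap i lies between sentences[i-1] and sentences[i]. A boundary is
    proposed when the blocks of `window` sentences on either side of
    the gap have cosine similarity below `threshold`.
    """
    bags = [Counter(s.lower().split()) for s in sentences]
    boundaries = []
    for gap in range(1, len(bags)):
        left = sum(bags[max(0, gap - window):gap], Counter())
        right = sum(bags[gap:gap + window], Counter())
        if cosine(left, right) < threshold:
            boundaries.append(gap)
    return boundaries


sentences = [
    "the cat chased the mouse",
    "the mouse escaped the cat",
    "stocks fell sharply today",
    "investors sold stocks today",
]
boundaries = segment(sentences)
print(boundaries)
```

On these four sentences the only gap with no shared vocabulary is the one between the second and third sentence, so that is where the boundary is placed. A fixed threshold is a crude choice; looking for local minima in the similarity curve is a common refinement.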
8. Ambiguity is an inherent characteristic of human languages. From the
analysis point of view, please give one example of ambiguity at each of the
morphological, lexical, syntactic, semantic, and pragmatic levels.
(15 points)
9. In the term project, a sentiment dictionary seems to be useful. However, the
coverage of the dictionary is a problem. Do you have any ideas for extending
the coverage of this dictionary? (Bonus: 10 points)