Re: [Code] SAS: removing observations whose substring is duplicated

Author: freedomyang (Be Simple)

Board: Statistics

Title: Re: [Code] SAS: removing observations whose substring is duplicated

Time: Tue Jul 13 19:28:55 2021

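Original post snipped. I found this requirement quite interesting and had not run into it before, so I came up with another solution, though it may be harder to read.

/* Working copy of the data with an empty check column */
data DsOut;
    set have;
    chk = .;
run;

/* Generate one PROC SQL update per record of have */
data _null_;
    set have;
    call execute("proc sql noprint;");
    call execute(" update DsOut a");
    call execute(" set chk = (select chk from");
    call execute(" (select ObsName, sum(prxmatch('/"||strip(ObsName)||"/', ObsName))-1 as chk");
    call execute(" from have having ObsName = '"||strip(ObsName)||"') b ");
    call execute(" where a.ObsName = b.ObsName)");
    call execute(" where a.ObsName = '"||strip(ObsName)||"';");
    call execute("quit;");
run;

/* Keep only the records that nothing else matched */
data DsOut_final;
    set DsOut;
    where chk = 0;
run;

This uses call execute to do the work once per record, together with proc sql to compute on a single column. The rough idea is to take each record's ObsName as a regular expression and check how many records in the dataset match it (the -1 removes the record's match against itself).
Strictly speaking, prxmatch returns the position of the match rather than 0/1, so the sum is not an exact count, but it is still 0 after the -1 only when no other record matches.
Note: this assumes the data has no duplicate records.
Once the result is written back to chk, keep only chk = 0, which means the record is unique.

BTW, the regex condition is not very rigorous (for example, regex metacharacters in ObsName are not escaped), so adjust it as needed.

--
※ Origin: 批踢踢實業坊 (ptt.cc), from: 123.193.213.12 (Taiwan)
※ Article URL: https://www.ptt.cc/bbs/Statistics/M.1626175737.A.C8A.html
※ Edited: freedomyang (123.193.213.12 Taiwan), 07/13/2021 19:35:27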
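
For comparison with the call execute approach in the post, here is a minimal sketch of the same filter written as a single proc sql step. It is not the author's method: it swaps the regular expression for a literal, case-sensitive substring test with FIND inside a correlated NOT EXISTS, which sidesteps the regex escaping issue. The dataset name have, the column ObsName, and the output name DsOut_final come from the post; the sample values are made up for illustration, and the post's assumption of no duplicate records still applies.

```sas
/* Hypothetical sample data; these values are illustrative only. */
data have;
    length ObsName $20;
    input ObsName $;
    datalines;
ABC
ABCD
XYZ
BCD
;
run;

/* Keep a row only if no OTHER row's ObsName contains it as a literal substring. */
/* STRIP removes the trailing blanks that pad fixed-length character values.     */
proc sql;
    create table DsOut_final as
    select a.*
    from have as a
    where not exists
        (select *
         from have as b
         where b.ObsName ne a.ObsName
           and find(b.ObsName, strip(a.ObsName)) > 0);
quit;
```

With these sample values, ABC and BCD should be dropped because both occur inside ABCD, leaving ABCD and XYZ. The correlated subquery compares every pair of rows, so on a large dataset this may be slow, much like the per-record updates above.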

Push Meidien: Thanks, f! I'll look into it! 07/14 10:58