※ Quoting playaround (打滾):
: [Problem type]:
: Converting N*1 data into M*16
: [Software familiarity]:
: R beginner
: [Problem description]:
: The raw data (a csv file) looks roughly like this:
: time1
: a = 5
: b = 70
: c = "rest"
: ...
: ...
: time2
: a = 8
: b = 15
: c = "rest_2"
: ...
: ...
: I want to reshape it, 16 lines at a time, into an M*16 matrix.
: The first row holds the column titles,
: and the a, b, c, etc. at the start of each line are the row titles.
: Something like this:
: time a b c ...
: time1 5 70 "rest"
: time2 8 15 "rest_2"
: The functions I found all seem to group by identical values within the same column,
: so I am not sure how to do what I need here.
: Posted from my phone, please excuse the formatting.
: Thanks, everyone.
: -----
: Sent from JPTT on my Xiaomi MI 5.

Here is another approach for reference, plus how to do automatic type conversion XD

dataStr <- 'time1
a = 5
b = 70
c = "rest"
time2
a = 8
b = 15
c = "rest_2"
time3
a = 1
b = 45
c = "rest_3"'

# Equivalent to the txt variable the previous two replies got by reading the file with readLines
txt <- strsplit(dataStr, "\n")[[1]]
# Rewrite the time lines into the same "name = value" format as the others
txt[grepl("time", txt)] <- paste0("time = ", txt[grepl("time", txt)])
# Split each line into a column name and a value, then cbind all the pieces
# (the result is a 2-row matrix: row 1 = names, row 2 = values)
out <- do.call(cbind, strsplit(txt, "\\s+=\\s+"))
# Get the column names
columnNames <- unique(out[1, ])
# Collect the values for each column into a list
columnList <- lapply(columnNames, function(colname){
  # Take the values that belong to this name and convert their type automatically
  # (as.is = FALSE turns strings into factors, as in the output below)
  type.convert(out[2, out[1, ] == colname], as.is = FALSE)
})
# Make sure every column has the same length
# (check columnList, not out: out is a matrix, so every element has length 1)
if (length(unique(sapply(columnList, length))) != 1)
  stop("Columns have different lengths; please check the data")
# Assign the names
names(columnList) <- columnNames
# Convert to a data.frame
resultDf <- as.data.frame(columnList)
#    time a  b        c
# 1 time1 5 70   "rest"
# 2 time2 8 15 "rest_2"
# 3 time3 1 45 "rest_3"

> str(resultDf)
'data.frame':   3 obs. of  4 variables:
 $ time: Factor w/ 3 levels "time1","time2",..: 1 2 3
 $ a   : int  5 8 1
 $ b   : int  70 15 45
 $ c   : Factor w/ 3 levels "\"rest\"","\"rest_2\"",..: 1 2 3

A rare post of mine that uses no packages at all XD

Package version:

library(data.table)
library(stringr)
library(pipeR)

txt <- strsplit(dataStr, "\n")[[1]]
txt[str_detect(txt, "time")] <- str_c("time = ", txt[str_detect(txt, "time")])
# cumsum over the time lines gives each record an id
outDf <- txt %>>% str_detect("time") %>>% cumsum %>>%
  cbind(do.call(rbind, str_split(txt, "\\s+=\\s+"))) %>>%
  data.table %>>% setnames(c("id", "var", "value")) %>>%
  # Long to wide: one row per id, one column per var
  # (this step is required to get the output shown below)
  dcast(id ~ var, value.var = "value") %>>%
  `[`(j = id := NULL) %>>%
  `[`(j = eval(names(.)) := lapply(.SD, type.convert, as.is = FALSE))
#    a  b        c  time
# 1: 5 70   "rest" time1
# 2: 8 15 "rest_2" time2
# 3: 1 45 "rest_3" time3

> str(outDf)
Classes ‘data.table’ and 'data.frame':  3 obs. of  4 variables:
 $ a   : int  5 8 1
 $ b   : int  70 15 45
 $ c   : Factor w/ 3 levels "\"rest\"","\"rest_2\"",..: 1 2 3
 $ time: Factor w/ 3 levels "time1","time2",..: 1 2 3
 - attr(*, ".internal.selfref")=<externalptr>

--
Article series on R data-wrangling packages:
magrittr #1LhSWhpH (R_Language) https://goo.gl/72l1m9
data.table #1LhW7Tvj (R_Language) https://goo.gl/PZa6Ue
dplyr (parts 1 and 2) #1LhpJCfB,#1Lhw8b-s (R_Language) https://goo.gl/I5xX9b
tidyr #1Liqls1R (R_Language) https://goo.gl/i7yzAz
pipeR #1NXESRm5 (R_Language) https://goo.gl/zRUISx

--
※ Origin: 批踢踢實業坊 (ptt.cc), From: 111.253.88.5
※ Article URL: https://www.ptt.cc/bbs/R_Language/M.1503414254.A.448.html
※ Edited: celestialgod (111.253.88.5), 08/23/2017 01:04:19
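
One more variant, only as a sketch: the question says each record takes a fixed number of lines (16 in the real file), so under that assumption the lines can be laid out with matrix(..., byrow = TRUE) instead of being matched by name. This is a different technique from the two versions in this post. blocks_to_df and block_size are hypothetical names made up for illustration, and the sketch assumes the first line of every block is the time label and that the keys appear in the same order in every block.

```r
# Hypothetical helper (not from the original post): reshape "name = value"
# lines into a data.frame, assuming every record spans exactly `block_size`
# lines and starts with its time label.
blocks_to_df <- function(txt, block_size) {
  # Fail early if the input is not a whole number of records
  stopifnot(length(txt) %% block_size == 0)
  # One row per record, one column per line position inside the record
  m <- matrix(txt, ncol = block_size, byrow = TRUE)
  # Column names come from the first record: everything before the "="
  keys <- sub("\\s*=.*$", "", m[1, -1])
  # Values: everything after the first "=" (sub keeps the matrix dimensions)
  vals <- sub("^[^=]*=\\s*", "", m[, -1, drop = FALSE])
  df <- data.frame(time = m[, 1], vals, stringsAsFactors = FALSE)
  names(df) <- c("time", keys)
  # Automatic type conversion per column; as.is = TRUE keeps strings as character
  df[] <- lapply(df, type.convert, as.is = TRUE)
  df
}

# Small example with 4 lines per record (the real data would use block_size = 16)
txt <- c('time1', 'a = 5', 'b = 70', 'c = "rest"',
         'time2', 'a = 8', 'b = 15', 'c = "rest_2"')
res <- blocks_to_df(txt, block_size = 4)
str(res)
```

The trade-off is that this version does not check the key names of later records, so it only fits files whose layout is strictly regular; the name-based versions tolerate reordered keys within a record.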