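I have a csv file where each row defines a room in a given building. Along with the room, each row has a floor field. What I want to extract is all floors in all buildings.
My file looks like this...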
"u_floor","u_room","name"
0,"00BDF","AIRPORT TEST "
0,0,"BRICKER HALL, JOHN W "
0,3,"BRICKER HALL, JOHN W "
0,5,"BRICKER HALL, JOHN W "
0,6,"BRICKER HALL, JOHN W "
0,7,"BRICKER HALL, JOHN W "
0,8,"BRICKER HALL, JOHN W "
0,9,"BRICKER HALL, JOHN W "
0,19,"BRICKER HALL, JOHN W "
0,20,"BRICKER HALL, JOHN W "
0,21,"BRICKER HALL, JOHN W "
0,25,"BRICKER HALL, JOHN W "
0,27,"BRICKER HALL, JOHN W "
0,29,"BRICKER HALL, JOHN W "
0,35,"BRICKER HALL, JOHN W "
0,45,"BRICKER HALL, JOHN W "
0,59,"BRICKER HALL, JOHN W "
0,60,"BRICKER HALL, JOHN W "
0,61,"BRICKER HALL, JOHN W "
0,63,"BRICKER HALL, JOHN W "
0,"0006M","BRICKER HALL, JOHN W "
0,"0008A","BRICKER HALL, JOHN W "
0,"0008B","BRICKER HALL, JOHN W "
0,"0008C","BRICKER HALL, JOHN W "
0,"0008D","BRICKER HALL, JOHN W "
0,"0008E","BRICKER HALL, JOHN W "
0,"0008F","BRICKER HALL, JOHN W "
0,"0008G","BRICKER HALL, JOHN W "
0,"0008H","BRICKER HALL, JOHN W "
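What I want is all floors in all buildings.
I am using cat, awk, sort and uniq to obtain this list, but I am having a problem with the "," inside the building name field, such as "BRICKER HALL, JOHN W", and it is throwing off my entire csv generation.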
cat Buildings.csv | awk -F, '{print $1","$2}' | sort | uniq > Floors.csv
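To make the problem concrete, here is a minimal check of how awk splits one of the sample rows when the separator is a bare comma. The function name count_fields is just a hypothetical helper for this illustration.

```shell
# Hypothetical helper: report how many fields awk sees when splitting on bare commas.
count_fields() {
    awk -F, '{ print NF }'
}

# A row with three CSV columns is seen as four fields, because the comma
# inside the quoted building name is treated as a separator too.
printf '%s\n' '0,3,"BRICKER HALL, JOHN W "' | count_fields
```

The quoted name is cut into "BRICKER HALL and  JOHN W ", so any field after a comma-bearing name ends up shifted.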
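How can I get awk to split on the comma but ignore a comma that sits between the "" of a field? Alternatively, does someone have a better solution?
Based on the answer provided, which suggested an awk csv parser, I was able to get to a solution: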
cat Buildings.csv | awk -f csv.awk | awk -F" -> 2|" '{print $2}' | awk -F"|" '{print $2","$3}' | sort | uniq > floors.csv
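The first step runs the csv awk program (http://lorance.freeshell.org/csv/). That program prints the original line followed by " -> #", where # is the count of fields (i.e. columns) it parsed from the csv, so I split on " -> 2|", which matches that formatting, and print $2 to keep only the csv-parsed content. From there I can split the parsed result on "|", which is what the program replaces the separating commas with. Then sort, uniq, redirect to a file, and it's done!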
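For comparison, the same idea can probably be done in a single awk pass without the external parser. This is only a rough sketch, not the csv.awk approach above: unique_floors and split_csv are hypothetical names, and it assumes the column layout from the sample (u_floor, u_room, name), a header row, and no line breaks inside quoted fields. It prints the floor (column 1) and building name (column 3), which is my reading of "all floors in all buildings".

```shell
# Hypothetical helper: print the unique (floor, building) pairs from a CSV
# whose columns are u_floor,u_room,name (assumed from the sample above).
unique_floors() {
    awk '
    # Split one CSV record into f[], treating commas inside double quotes
    # as data. Returns the field count. No support for embedded newlines.
    function split_csv(line, f,    n, i, c, inq, cur) {
        split("", f)                      # portable way to empty the array
        n = 0; inq = 0; cur = ""
        for (i = 1; i <= length(line); i++) {
            c = substr(line, i, 1)
            if (inq) {
                if (c != "\"") {
                    cur = cur c
                } else if (substr(line, i + 1, 1) == "\"") {
                    cur = cur "\""        # doubled quote is an escaped quote
                    i++
                } else {
                    inq = 0               # closing quote
                }
            } else if (c == "\"") {
                inq = 1                   # opening quote
            } else if (c == ",") {
                f[++n] = cur; cur = ""    # separator outside quotes
            } else {
                cur = cur c
            }
        }
        f[++n] = cur
        return n
    }
    NR == 1 { next }                      # skip the header row
    {
        split_csv($0, f)
        key = f[1] SUBSEP f[3]
        if (!(key in seen)) {             # emit each pair only once
            seen[key] = 1
            name = f[3]
            gsub(/"/, "\"\"", name)       # re-escape quotes for output
            printf "%s,\"%s\"\n", f[1], name
        }
    }' "$1"
}
```

Usage would be something like unique_floors Buildings.csv > Floors.csv. The pairs come out in first-seen order, so pipe through sort if a sorted list is wanted; uniq is not needed because duplicates are already dropped.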
Thanks for the help.