ArcPy Data Processing in Practice: Batch Statistics of Unique Values and Feature Counts
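A land resources surveying and mapping institute in CS, a provincial capital in central China, is drawing up a new standard for basic topographic data. As part of the project, the legacy data has to be analyzed, processed where necessary, and imported into a database that follows the new standard. Python is a natural fit for analyzing and batch-processing the old data.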
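The first task is to collect, in batch, the unique values of one field (field name: YSDM, field alias: 要素代码, i.e. feature code), fetch the remark that goes with each unique value (field name: REMARK), count the features that carry that value, and write the results to a text file. Feature classes that have already been processed can be skipped.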
The code is as follows:
>>> import codecs
>>> import arcpy
>>> count = 0       # running number of output records
... skipCount = 0   # feature classes skipped so far
... pCount = 0      # feature classes visited so far
... # Feature classes already handled in an earlier run
... processed = ["CDD_KDCSG","CDM_KDCSG","CSLDD_KDCSG"]
... tf = codecs.open(u"D:/result.txt",'w','utf-8')
... arcpy.env.workspace = u'D:/data.gdb'
... dss = arcpy.ListDatasets()
... for ds in dss:
...     fcs = arcpy.ListFeatureClasses(feature_dataset = ds)
...     for fc in fcs:
...         pCount = pCount + 1
...         if fc in processed:
...             skipCount = skipCount + 1
...             print(u"Skip {0}. SkipCount {1} TotalCount {2}".format(fc, skipCount, pCount))
...             continue
...         print(fc)
...         # Check that the feature class has the target field
...         field = 'YSDM'
...         fields = arcpy.ListFields(fc)
...         hasField = False
...         for af in fields:
...             print(u"Enum Field {0}".format(af.name))
...             if af.name.upper() == field:
...                 hasField = True
...                 break
...         if not hasField:
...             print(u"No Target Field")
...             continue
...         # Collect the unique values of the field
...         values = [row[0] for row in arcpy.da.SearchCursor(fc, field)]
...         uvs = set(values)
...         print(uvs)
...         for uv in uvs:
...             # Skip empty and NULL codes
...             if uv:
...                 where = u"{0}='{1}'".format(field, uv)
...                 # Number of features carrying this value
...                 fcount = len([feature[0] for feature in arcpy.da.SearchCursor(fc, field, where)])
...                 # Take the remark from the first matching feature only
...                 rows = arcpy.SearchCursor(fc, where, fields="YSDM;REMARK")
...                 for aRow in rows:
...                     remark = aRow.getValue("REMARK")
...                     count = count + 1
...                     avalue = u"{0},{1},{2},{3},{4}".format(count, fc, uv, remark, fcount)
...                     print(avalue)
...                     # Records in the file are separated by ';'
...                     tf.write(u"{0};".format(avalue))
...                     break
... tf.close()
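The script above opens two more cursors for every unique value, one to count the features and one to fetch the remark. The same three pieces of information (code, first remark, count) can also be gathered in a single pass over (YSDM, REMARK) pairs. The following is only a minimal sketch in plain Python so that it can be tried without ArcGIS: `summarize_codes` and the sample rows are hypothetical, and in ArcPy the pairs would presumably come from something like `arcpy.da.SearchCursor(fc, ["YSDM", "REMARK"])`.

```python
from collections import OrderedDict


def summarize_codes(rows):
    """Count features per code in one pass, keeping the first remark seen.

    `rows` is any iterable of (code, remark) pairs. Returns an OrderedDict
    mapping code -> [remark, count], in the order codes are first seen.
    """
    summary = OrderedDict()
    for code, remark in rows:
        # Skip empty and NULL codes; the script above writes no record
        # for them either.
        if code is None or code == u"":
            continue
        if code not in summary:
            summary[code] = [remark, 0]
        summary[code][1] += 1
    return summary


# Hypothetical sample rows standing in for cursor output.
sample = [
    (u"810503", u"灌木林"),
    (u"810504", u"竹林"),
    (u"810503", u"灌木林"),
    (u"", u""),
    (u"810503", None),
]
result = summarize_codes(sample)
for code, (remark, n) in result.items():
    print(u"{0},{1},{2}".format(code, remark, n))
```

Whether the first remark seen in such a pass matches the one the script picks depends on the order in which the cursors return rows, so treat it as an approximation of the script's behavior rather than a drop-in replacement.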
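Running the ArcPy script against the old data gives the following results (excerpt). Each record lists the running number, the feature class, the YSDM code, the REMARK value and the feature count: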
139,LDM_KDCSG,810507013,幼林.苗圃边界,20919
140,LDM_KDCSG,810501013,成林边界,16001
141,LDM_KDCSG,810502013,未成林边界,384
142,LDM_KDCSG,810502,幼林,261
143,LDM_KDCSG,810503,灌木林,445
144,LDM_KDCSG,810500, ,510
145,LDM_KDCSG,810501,成林,3
146,LDM_KDCSG,810507,苗圃,1
147,LDM_KDCSG,810504,竹林,29
148,LDM_KDCSG,810504071,单个大面积竹林符号,30
149,LDM_KDCSG,810504013,大面积竹林边界,6800
150,LDM_KDCSG,810508003,防火带,38
151,LDM_KDCSG,810503013,大面积灌木林边界,10036
152,LDM_KDCSG,810507063,苗圃(面),7471
153,LDM_KDCSG,810406013,其他经济林边界,136
154,LDX_KDCSG,810503,灌木林,506
155,LDX_KDCSG,810513,特殊树,626
156,LDX_KDCSG,810506,迹地,1
157,LDX_KDCSG,810507,苗圃,256
158,LDX_KDCSG,810504,竹林,241
159,LDX_KDCSG,810510,行树,6610
160,LDX_KDCSG,810508,防火带,2
161,LDX_KDCSG,810510002,行树,595
162,LDX_KDCSG,810510012,行树,11874
163,LDX_KDCSG,810534, ,13
164,LDX_KDCSG,810510072,灌木行树,217
165,LDX_KDCSG,810559, ,6
166,LDX_KDCSG,810504042,狭长的竹林,4
167,LDX_KDCSG,810503042,狭长灌木林,14
168,TGX_KDCSG,810403,菜园,30
169,TGX_KDCSG,810401,果园,701
170,TGX_KDCSG,810200012,单线田埂,94573
171,TGX_KDCSG,810405,其他园地,7
172,TGX_KDCSG,810200034,双线田埂左边,2
173,TGX_KDCSG,810304,水生作物地,35
174,TGX_KDCSG,810301,稻田,502
175,TGX_KDCSG,810302,旱地,1196
176,TGX_KDCSG,810303,菜地,800
177,TGX_KDCSG,810200,田埂,1755
178,TZM_KDCSG,830401003,沙砾地、戈壁滩,1
179,TZM_KDCSG,830100013,盐碱地边界,16
180,TZM_KDCSG,830404013,沙泥地边界,1
181,TZM_KDCSG,830402013,石块地边界,10
182,TZM_KDCSG,830401013,沙砾滩边界,1
183,YDD_KDCSG,810405024,经济作物地符号,7
184,YDD_KDCSG,810427, ,1
185,YDD_KDCSG,810426, ,27
186,YDD_KDCSG,810403024,茶园符号,26
187,YDD_KDCSG,810416, ,164
188,YDD_KDCSG,810417, ,8
189,YDD_KDCSG,810407, ,358
190,YDD_KDCSG,810406, ,6283
191,YDD_KDCSG,810403031,茶园,419
192,YDD_KDCSG,810401031,果园,5007
193,YDD_KDCSG,810401024,果园符号,486
194,YDD_KDCSG,810405031,单个经济作物地符号,13
195,YDD_KDCSG,810405054,其他园地,327
196,YDM_KDCSG,810405013,经济作物地边界,454
197,YDM_KDCSG,810401063,果园(面),980
198,YDM_KDCSG,810403013,茶园边界,1299
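Several records above have a blank remark (for example codes 810500 and 810534), which is the kind of thing worth checking before mapping old codes to the new standard. One possible follow-up, sketched here under assumptions, is to parse the records and list the codes whose remark is blank. The helper names `parse_record` and `codes_without_remark` are hypothetical; when reading the file written by the script, where records are separated by `;`, the text would first have to be split on `;`.

```python
def parse_record(line):
    """Split one result record into (seq, feature_class, code, remark, count)."""
    parts = line.strip().split(u",")
    seq, feature_class, code = parts[0], parts[1], parts[2]
    # The remark sits between the code and the final count; re-join it
    # in case the remark itself contains commas.
    remark = u",".join(parts[3:-1]).strip()
    return int(seq), feature_class, code, remark, int(parts[-1])


def codes_without_remark(lines):
    """Return (feature_class, code, count) for records whose remark is blank."""
    missing = []
    for line in lines:
        if not line.strip():
            continue  # ignore empty lines
        _, feature_class, code, remark, count = parse_record(line)
        if not remark:
            missing.append((feature_class, code, count))
    return missing


# A few records copied from the results above.
lines = [
    u"143,LDM_KDCSG,810503,灌木林,445",
    u"144,LDM_KDCSG,810500, ,510",
    u"163,LDX_KDCSG,810534, ,13",
    u"169,TGX_KDCSG,810401,果园,701",
]
for item in codes_without_remark(lines):
    print(item)
```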
Reposted from: https://blog.csdn.net/a_dev/article/details/89413067