Sina Blog

pandas cut: study notes on binning values into intervals

2020-02-23 00:17
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
'''
Created on Sat Feb 22 22:09:45 2020
@author: dandelion
'''
import pandas as pd
import os
file = os.path.join(os.getcwd(), 'log.csv')
data = pd.read_csv(file, header=0, sep=',')
# Bin edges (interval thresholds)
bins = [-150, -110, -100, -90, -80, -70, -30]
# Label for each interval
labels = [
    '(-150,-110]',
    '(-110,-100]',
    '(-100,-90]',
    '(-90,-80]',
    '(-80,-70]',
    '(-70,-30]',
]
# Assign each sample to an interval; right=True makes every interval closed on the right
data['rsrp_range'] = pd.cut(data['OptimalAvgRSRP'], bins=bins, labels=labels, right=True)
# Inspect the columns, including the newly added field
columns = data.columns.to_list()
print('Column names in the table:', columns)
print('----------------------------')
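# Count the samples in each interval with groupby aggregation
# (observed=False keeps empty intervals in the result with a count of 0)
pdf = data.groupby(data['rsrp_range'], observed=False).agg({'rsrp_range': 'count'})
# Export the per-interval counts
pdf.to_csv(os.path.join(os.getcwd(), 'sample_counts_per_interval_by(groupby).csv'))
print('Sample counts per interval by (groupby):', pdf, sep='\n')
print('---------------------')
# Count the samples in each interval with Series.value_counts()
# (the top-level pd.value_counts() is deprecated in recent pandas releases)
value_count = data['rsrp_range'].value_counts(sort=False)
print('Sample counts per interval by (value_counts):', value_count, sep='\n')
'''
# Example output:
#--------------------
Sample counts per interval by (groupby):
             rsrp_range
rsrp_range
(-150,-110]         253
(-110,-100]          14
(-100,-90]          186
(-90,-80]            73
(-80,-70]             0
(-70,-30]             0
---------------------
Sample counts per interval by (value_counts):
(-150,-110]    253
(-110,-100]     14
(-100,-90]     186
(-90,-80]       73
(-80,-70]        0
(-70,-30]        0
Name: rsrp_range, dtype: int64
'''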
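
The script above depends on how pd.cut treats values on or outside the bin edges, so here is a small self-contained sketch of that behavior. The RSRP values are made up for illustration and do not come from log.csv. The list comprehension that builds the labels is just one possible alternative to typing them by hand. With right=True, each interval is open on the left and closed on the right, so -150 itself, anything above -30, and missing values all end up as NaN.

```python
import pandas as pd

# Hypothetical RSRP samples in dBm (illustrative only, not taken from log.csv).
rsrp = pd.Series([-150, -149.5, -110, -109.9, -85, -30, -29, None],
                 name='OptimalAvgRSRP')

bins = [-150, -110, -100, -90, -80, -70, -30]
# Derive the labels from the bin edges so the two lists cannot drift apart.
labels = [f'({lo},{hi}]' for lo, hi in zip(bins[:-1], bins[1:])]

# right=True: each interval is (lo, hi], so -110 belongs to (-150,-110].
rsrp_range = pd.cut(rsrp, bins=bins, labels=labels, right=True)

# Samples that fall outside (-150, -30] or are missing become NaN.
unbinned = int(rsrp_range.isna().sum())

# sort=False keeps the intervals in bin order; empty intervals are reported as 0.
counts = rsrp_range.value_counts(sort=False)

print(counts)
print('Samples not assigned to any interval:', unbinned)
```

If the lowest edge should be included as well, pd.cut accepts include_lowest=True. Comparing the sum of the counts plus the number of NaN entries against the row count is a quick way to check that no sample was dropped silently.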
