pandas: computing age, distinct counts per group, and distinct counts for ages under 7

June 12, 2023

import pandas as pd
import numpy as np
import janitor  # pyjanitor: registers DataFrame.clean_names()

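# Read the workbook, keeping the code columns as strings so leading zeros are preserved.
sc = (
    pd.read_excel(
        "/mnt/c/Users/xuefeng/Downloads/非重卡删除.xlsx",
        dtype={'SC_GLDW_BM': 'object', 'YM_BM': 'object'},
    )
    .clean_names()  # normalize column names to lower case, e.g. SC_GLDW_BM -> sc_gldw_bm
)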

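# Count the rows whose ym_mc starts with "新冠" (COVID-19), grouped by the
# first four characters of sc_gldw_bm (shi), and write the result to Excel.
(
    sc.query('ym_mc.str.startswith("新冠")', engine='python')  # .str methods need the python engine
    .astype({'sc_gldw_bm': 'string'})
    .assign(shi=lambda x: x.sc_gldw_bm.str[0:4])
    .groupby('shi')
    .agg(count=('shi', 'count'))
    .reset_index()
    .sort_values('shi')
    .to_excel("/mnt/c/Users/xuefeng/Downloads/非重卡删除1.xlsx")
)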


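# xian / shi: first 6 / 4 characters of sc_gldw_bm; csrq: birth date (YYYYMMDD) taken from the ID number zjhm.
test = (
    sc.assign(xian=sc.sc_gldw_bm.str[:6], shi=sc.sc_gldw_bm.str.slice(0, 4), csrq=sc.zjhm.str[6:14])
    # Keep codes starting with 6211 and birth dates that are 8 characters long and begin with 19 or 20.
    .query("sc_gldw_bm.str.startswith('6211') & csrq.str.len()==8 & csrq.str.slice(0,2) in ('19','20')", engine='python')
    # Age in years at jz_sj; birth dates that fail to parse become NaT, so their age is NaN.
    .assign(age=lambda x: (pd.to_datetime(x.jz_sj) - pd.to_datetime(x.csrq, format='%Y%m%d', errors='coerce')) / pd.Timedelta(days=365.25))
)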
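The age step can be tried on its own with a few made-up rows. This is a minimal sketch, assuming 18-character ID strings whose characters 7 to 14 hold the birth date; the sample IDs and dates are hypothetical and only the column names zjhm and jz_sj come from the code above.

```python
import pandas as pd

# Hypothetical sample rows; the third ID carries an invalid birth date on purpose.
df = pd.DataFrame({
    "zjhm": ["620102201801011234", "620102199005051234", "62010220AB01011234"],
    "jz_sj": ["2023-01-01", "2023-01-01", "2023-01-01"],
})

# Characters 7-14 of the ID (slice 6:14) are parsed as YYYYMMDD; bad values become NaT.
birth = pd.to_datetime(df["zjhm"].str[6:14], format="%Y%m%d", errors="coerce")

# Dividing a timedelta by a 365.25-day Timedelta gives the age in fractional years.
df["age"] = (pd.to_datetime(df["jz_sj"]) - birth) / pd.Timedelta(days=365.25)
print(df[["zjhm", "age"]])
```

Rows with an unparseable birth date end up with a NaN age, so they drop out of any later comparison such as age <= 7.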

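# Per xian: n = distinct ID numbers; age7 = distinct ID numbers with age <= 7
# (use < 7 to exclude exactly 7). The age values are looked up by each group's own index.
(
    test
    .groupby('xian')
    .agg(n=('zjhm', 'nunique'), age7=('zjhm', lambda x: x[test.loc[x.index, 'age'] <= 7].nunique()))
)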
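An alternative that avoids reaching back into the outer DataFrame from inside the lambda is to mask the ID column first and then count distinct values on both columns. This is a sketch on hypothetical data, not the approach used above; the helper column name zjhm_u7 is made up for illustration.

```python
import pandas as pd

# Hypothetical sample: two counties, one repeated ID, mixed ages.
df = pd.DataFrame({
    "xian": ["621102", "621102", "621102", "621121", "621121"],
    "zjhm": ["A", "A", "B", "C", "D"],
    "age": [3.0, 3.0, 40.0, 6.5, 6.9],
})

result = (
    # where() keeps the ID when age <= 7 and sets it to NaN otherwise;
    # nunique() skips NaN by default, so only the young IDs are counted.
    df.assign(zjhm_u7=df["zjhm"].where(df["age"] <= 7))
    .groupby("xian")
    .agg(n=("zjhm", "nunique"), age7=("zjhm_u7", "nunique"))
)
print(result)
```

Because the mask is built before grouping, both aggregations use the built-in nunique rather than a Python lambda, which tends to be faster on large tables.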


xuefliang
