{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "196a647a-6faa-4aee-a0bf-a345852251dd",
|
||
"metadata": {},
|
||
"source": [
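"## pandas in Depth, Made Simple (Part 1)\n",
"\n",
"pandas is an open-source Python library that supports the entire data analysis workflow. Its author, Wes McKinney, began developing it in 2008 with the main goal of providing a tool for data analysis and processing. pandas wraps a series of operations, from data loading, reshaping, and cleaning through to pivoting and presentation, and it provides three core data types:\n",
"1. `Series`: a data series representing one-dimensional data. Unlike a one-dimensional array, every value has a corresponding index label, and `Series` offers a richer set of data-processing methods than `ndarray`.\n",
"2. `DataFrame`: a data frame (also called a data table) representing two-dimensional data. Compared with a two-dimensional array, a `DataFrame` has both a row index and a column index, and it provides more than 100 methods for processing data.\n",
"3. `Index`: provides indexing services for `Series` and `DataFrame`."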
|
||
]
|
||
},
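{
"cell_type": "markdown",
"id": "added-sketch-core-types",
"metadata": {},
"source": [
"The three types described above can be seen side by side in a small sketch. The labels `a`, `b`, `c` and the column names `x`, `y` are made up for illustration and do not come from the notebook's data.\n",
"\n",
"```python\n",
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# A Series is one-dimensional data with an index label attached to every value.\n",
"s = pd.Series([95, 91, 69], index=['a', 'b', 'c'])\n",
"print(s['b'])            # label-based access, prints 91\n",
"\n",
"# A DataFrame is two-dimensional data with a row index and a column index.\n",
"df = pd.DataFrame(np.arange(6).reshape(3, 2), index=['a', 'b', 'c'], columns=['x', 'y'])\n",
"print(df.loc['b', 'y'])  # row label 'b', column label 'y', prints 3\n",
"\n",
"# Both axes of the DataFrame are Index objects.\n",
"print(isinstance(df.index, pd.Index), isinstance(df.columns, pd.Index))\n",
"```"
]
},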
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 1,
|
||
"id": "eb84f909-921a-47da-87b1-61578c871422",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"import numpy as np\n",
|
||
"import pandas as pd\n",
|
||
"import matplotlib.pyplot as plt\n",
|
||
"\n",
|
||
"plt.rcParams['font.sans-serif'].insert(0, 'SimHei')\n",
|
||
"plt.rcParams['axes.unicode_minus'] = False\n",
|
||
"get_ipython().run_line_magic('config', \"InlineBackend.figure_format = 'svg'\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "2102e83e-2a6d-47aa-b449-c058bea1a601",
|
||
"metadata": {},
|
||
"source": [
"### Creating DataFrame Objects"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 2,
|
||
"id": "87dbde08-dcab-4ede-a791-b56e11dd9115",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"np.random.seed(20)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 3,
|
||
"id": "4c5b2767-2074-4cdf-b1ba-beff6f425942",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([[ 95, 86, 75],\n",
|
||
" [ 91, 88, 86],\n",
|
||
" [ 69, 80, 71],\n",
|
||
" [ 82, 67, 94],\n",
|
||
" [ 92, 100, 81]])"
|
||
]
|
||
},
|
||
"execution_count": 3,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"stu_names = ['狄仁杰', '白起', '李元芳', '苏妲己', '孙尚香']\n",
|
||
"cou_names = ['语文', '数学', '英语']\n",
|
||
"scores_arr = np.random.randint(60, 101, (5, 3))\n",
|
||
"scores_arr"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"id": "f8c2a6bf-ca5e-479d-ab63-f5c3620186e3",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>语文</th>\n",
|
||
" <th>数学</th>\n",
|
||
" <th>英语</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>狄仁杰</th>\n",
|
||
" <td>95</td>\n",
|
||
" <td>86</td>\n",
|
||
" <td>75</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>白起</th>\n",
|
||
" <td>91</td>\n",
|
||
" <td>88</td>\n",
|
||
" <td>86</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>李元芳</th>\n",
|
||
" <td>69</td>\n",
|
||
" <td>80</td>\n",
|
||
" <td>71</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>苏妲己</th>\n",
|
||
" <td>82</td>\n",
|
||
" <td>67</td>\n",
|
||
" <td>94</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>孙尚香</th>\n",
|
||
" <td>92</td>\n",
|
||
" <td>100</td>\n",
|
||
" <td>81</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 语文 数学 英语\n",
|
||
"狄仁杰 95 86 75\n",
|
||
"白起 91 88 86\n",
|
||
"李元芳 69 80 71\n",
|
||
"苏妲己 82 67 94\n",
|
||
"孙尚香 92 100 81"
|
||
]
|
||
},
|
||
"execution_count": 4,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Method 1: build a DataFrame from a two-dimensional array\n",
|
||
"df1 = pd.DataFrame(data=scores_arr, columns=cou_names, index=stu_names)\n",
|
||
"df1"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 5,
|
||
"id": "baad5381-fb7d-4cc9-9288-a05d750144af",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Index(['狄仁杰', '白起', '李元芳', '苏妲己', '孙尚香'], dtype='object')"
|
||
]
|
||
},
|
||
"execution_count": 5,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Row index\n",
|
||
"df1.index"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 6,
|
||
"id": "d7f06b76-b60b-49cb-be72-adafb0978fca",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Index(['语文', '数学', '英语'], dtype='object')"
|
||
]
|
||
},
|
||
"execution_count": 6,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Column index\n",
|
||
"df1.columns"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 7,
|
||
"id": "13b1275d-77e5-4d5d-b227-19db3f4196fd",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([[ 95, 86, 75],\n",
|
||
" [ 91, 88, 86],\n",
|
||
" [ 69, 80, 71],\n",
|
||
" [ 82, 67, 94],\n",
|
||
" [ 92, 100, 81]])"
|
||
]
|
||
},
|
||
"execution_count": 7,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Values as a two-dimensional array\n",
|
||
"df1.values"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"id": "dbf5bb11-1600-4ae4-bc95-369bc8189c20",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"scores_dict = {\n",
|
||
" '语文': [95, 91, 69, 82, 92],\n",
|
||
" '数学': [86, 88, 80, 67, 100],\n",
|
||
" '英语': [75, 86, 71, 94, 81]\n",
|
||
"}"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 9,
|
||
"id": "c300bbbd-329a-4852-bf76-78ce1de02b8f",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>语文</th>\n",
|
||
" <th>数学</th>\n",
|
||
" <th>英语</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>狄仁杰</th>\n",
|
||
" <td>95</td>\n",
|
||
" <td>86</td>\n",
|
||
" <td>75</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>白起</th>\n",
|
||
" <td>91</td>\n",
|
||
" <td>88</td>\n",
|
||
" <td>86</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>李元芳</th>\n",
|
||
" <td>69</td>\n",
|
||
" <td>80</td>\n",
|
||
" <td>71</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>苏妲己</th>\n",
|
||
" <td>82</td>\n",
|
||
" <td>67</td>\n",
|
||
" <td>94</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>孙尚香</th>\n",
|
||
" <td>92</td>\n",
|
||
" <td>100</td>\n",
|
||
" <td>81</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 语文 数学 英语\n",
|
||
"狄仁杰 95 86 75\n",
|
||
"白起 91 88 86\n",
|
||
"李元芳 69 80 71\n",
|
||
"苏妲己 82 67 94\n",
|
||
"孙尚香 92 100 81"
|
||
]
|
||
},
|
||
"execution_count": 9,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Method 2: build a DataFrame from a dictionary of columns\n",
|
||
"df2 = pd.DataFrame(data=scores_dict, index=stu_names)\n",
|
||
"df2"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 10,
|
||
"id": "705c0de6-43ff-46c6-85d5-301743d18d43",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"<class 'pandas.core.frame.DataFrame'>\n",
|
||
"Index: 5 entries, 狄仁杰 to 孙尚香\n",
|
||
"Data columns (total 3 columns):\n",
|
||
" # Column Non-Null Count Dtype\n",
|
||
"--- ------ -------------- -----\n",
|
||
" 0 语文 5 non-null int64\n",
|
||
" 1 数学 5 non-null int64\n",
|
||
" 2 英语 5 non-null int64\n",
|
||
"dtypes: int64(3)\n",
|
||
"memory usage: 558.0 bytes\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
"# Show summary information about the DataFrame\n",
|
||
"df2.info(memory_usage='deep')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 11,
|
||
"id": "71417ac2-8f4b-4950-9336-de6fbc1f5da4",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>公示编号</th>\n",
|
||
" <th>姓名</th>\n",
|
||
" <th>出生年月</th>\n",
|
||
" <th>单位名称</th>\n",
|
||
" <th>积分分值</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>202300001</td>\n",
|
||
" <td>张浩</td>\n",
|
||
" <td>1977-02</td>\n",
|
||
" <td>北京首钢股份有限公司</td>\n",
|
||
" <td>140.05</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>202300002</td>\n",
|
||
" <td>冯云</td>\n",
|
||
" <td>1982-02</td>\n",
|
||
" <td>中国人民解放军空军二十三厂</td>\n",
|
||
" <td>134.29</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>202300003</td>\n",
|
||
" <td>王天东</td>\n",
|
||
" <td>1975-01</td>\n",
|
||
" <td>中建二局第三建筑工程有限公司</td>\n",
|
||
" <td>133.63</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>202300004</td>\n",
|
||
" <td>陈军</td>\n",
|
||
" <td>1976-07</td>\n",
|
||
" <td>中建二局第三建筑工程有限公司</td>\n",
|
||
" <td>133.29</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>202300005</td>\n",
|
||
" <td>樊海瑞</td>\n",
|
||
" <td>1981-06</td>\n",
|
||
" <td>中国民生银行股份有限公司</td>\n",
|
||
" <td>132.46</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5998</th>\n",
|
||
" <td>202305999</td>\n",
|
||
" <td>曹恰</td>\n",
|
||
" <td>1983-09</td>\n",
|
||
" <td>首都师范大学科德学院</td>\n",
|
||
" <td>109.92</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5999</th>\n",
|
||
" <td>202306000</td>\n",
|
||
" <td>罗佳</td>\n",
|
||
" <td>1981-05</td>\n",
|
||
" <td>厦门方胜众合企业服务有限公司海淀分公司</td>\n",
|
||
" <td>109.92</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6000</th>\n",
|
||
" <td>202306001</td>\n",
|
||
" <td>席盛代</td>\n",
|
||
" <td>1983-06</td>\n",
|
||
" <td>中国华能集团清洁能源技术研究院有限公司</td>\n",
|
||
" <td>109.92</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6001</th>\n",
|
||
" <td>202306002</td>\n",
|
||
" <td>彭芸芸</td>\n",
|
||
" <td>1981-09</td>\n",
|
||
" <td>北京汉杰凯德文化传播有限公司</td>\n",
|
||
" <td>109.92</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6002</th>\n",
|
||
" <td>202306003</td>\n",
|
||
" <td>张越</td>\n",
|
||
" <td>1982-01</td>\n",
|
||
" <td>大爱城投资控股有限公司</td>\n",
|
||
" <td>109.92</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>6003 rows × 5 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 公示编号 姓名 出生年月 单位名称 积分分值\n",
|
||
"0 202300001 张浩 1977-02 北京首钢股份有限公司 140.05\n",
|
||
"1 202300002 冯云 1982-02 中国人民解放军空军二十三厂 134.29\n",
|
||
"2 202300003 王天东 1975-01 中建二局第三建筑工程有限公司 133.63\n",
|
||
"3 202300004 陈军 1976-07 中建二局第三建筑工程有限公司 133.29\n",
|
||
"4 202300005 樊海瑞 1981-06 中国民生银行股份有限公司 132.46\n",
|
||
"... ... ... ... ... ...\n",
|
||
"5998 202305999 曹恰 1983-09 首都师范大学科德学院 109.92\n",
|
||
"5999 202306000 罗佳 1981-05 厦门方胜众合企业服务有限公司海淀分公司 109.92\n",
|
||
"6000 202306001 席盛代 1983-06 中国华能集团清洁能源技术研究院有限公司 109.92\n",
|
||
"6001 202306002 彭芸芸 1981-09 北京汉杰凯德文化传播有限公司 109.92\n",
|
||
"6002 202306003 张越 1982-01 大爱城投资控股有限公司 109.92\n",
|
||
"\n",
|
||
"[6003 rows x 5 columns]"
|
||
]
|
||
},
|
||
"execution_count": 11,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Method 3: load data from a CSV file to create a DataFrame\n",
"df3 = pd.read_csv(\n",
"    'res/2023年北京积分落户数据.csv',\n",
"    # encoding='utf-8',                        # character encoding of the file\n",
"    # sep=',',                                 # field separator (comma by default)\n",
"    # delimiter='#',                           # alias for sep\n",
"    # header=0,                                # row that holds the column headers\n",
"    # quotechar='\"',                           # character that wraps strings (double quote by default)\n",
"    # index_col='公示编号',                     # column to use as the index\n",
"    # usecols=['公示编号', '姓名', '积分分值'],  # columns to load\n",
"    # nrows=10,                                # number of rows to load\n",
"    # skiprows=np.arange(1, 101),              # rows to skip\n",
"    # true_values=['是', 'Yes', 'YES'],        # values treated as boolean True\n",
"    # false_values=['否', 'No', 'NO'],         # values treated as boolean False\n",
"    # na_values=['---', 'N/A'],                # values treated as missing\n",
"    # iterator=True,                           # return an iterator instead of a DataFrame\n",
"    # chunksize=1000,                          # number of rows loaded per chunk\n",
")\n",
"df3"
|
||
]
|
||
},
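{
"cell_type": "markdown",
"id": "added-sketch-read-csv-chunks",
"metadata": {},
"source": [
"The `chunksize` option listed above changes what `pd.read_csv` returns: instead of one `DataFrame`, it yields a sequence of smaller ones. A minimal sketch, assuming a tiny in-memory CSV with made-up `id` and `score` columns in place of the real file:\n",
"\n",
"```python\n",
"import io\n",
"import pandas as pd\n",
"\n",
"# Assumption: a five-row CSV built in memory stands in for a large file on disk.\n",
"buf = io.StringIO()\n",
"pd.DataFrame({'id': range(1, 6), 'score': [90, 80, 70, 60, 50]}).to_csv(buf, index=False)\n",
"buf.seek(0)\n",
"\n",
"# With chunksize set, read_csv yields DataFrames of at most that many rows.\n",
"total = 0\n",
"for chunk in pd.read_csv(buf, chunksize=2):\n",
"    total += chunk['score'].sum()\n",
"print(total)  # prints 350\n",
"```\n",
"\n",
"Processing one chunk at a time keeps memory use bounded when a file is too large to load at once."
]
},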
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 12,
|
||
"id": "e9bd62fd-19d2-4ac1-97a6-3f6a0542e1df",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# %pip install openpyxl"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 13,
|
||
"id": "cb3387b9-3402-4b25-a5d5-ff9690a1ac06",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>销售区域</th>\n",
|
||
" <th>销售渠道</th>\n",
|
||
" <th>销售订单</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>182894-455</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>3351</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>205635-402</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>219</td>\n",
|
||
" <td>29</td>\n",
|
||
" <td>1016</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>205654-021</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>85</td>\n",
|
||
" <td>6320</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>205654-519</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>14</td>\n",
|
||
" <td>485</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>377781-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>61</td>\n",
|
||
" <td>2452</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1940</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>D89677</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>1560</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1941</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>182719-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>79</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>3028</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1942</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>G70083</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>55</td>\n",
|
||
" <td>2277</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1943</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>211471-902/704</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>59</td>\n",
|
||
" <td>59</td>\n",
|
||
" <td>852</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1944</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>211807-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>27</td>\n",
|
||
" <td>435</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>1945 rows × 8 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 销售区域 销售渠道 销售订单 品牌 售价 销售数量 直接成本\n",
|
||
"0 2020-01-01 上海 拼多多 182894-455 八匹马 99 83 3351\n",
|
||
"1 2020-01-01 上海 抖音 205635-402 八匹马 219 29 1016\n",
|
||
"2 2020-01-01 上海 天猫 205654-021 八匹马 169 85 6320\n",
|
||
"3 2020-01-01 上海 天猫 205654-519 八匹马 169 14 485\n",
|
||
"4 2020-01-01 上海 天猫 377781-010 皮皮虾 249 61 2452\n",
|
||
"... ... ... ... ... ... ... ... ...\n",
|
||
"1940 2020-12-30 北京 京东 D89677 花花姑娘 269 26 1560\n",
|
||
"1941 2020-12-30 福建 实体 182719-050 八匹马 79 97 3028\n",
|
||
"1942 2020-12-31 福建 实体 G70083 花花姑娘 269 55 2277\n",
|
||
"1943 2020-12-31 福建 抖音 211471-902/704 八匹马 59 59 852\n",
|
||
"1944 2020-12-31 福建 天猫 211807-050 八匹马 99 27 435\n",
|
||
"\n",
|
||
"[1945 rows x 8 columns]"
|
||
]
|
||
},
|
||
"execution_count": 13,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Method 4: load data from an Excel file to create a DataFrame\n",
|
||
"df6 = pd.read_excel(\n",
|
||
" 'res/2020年销售数据.xlsx',\n",
|
||
" sheet_name='data',\n",
|
||
")\n",
|
||
"df6"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 14,
|
||
"id": "d06abbd8-9a34-4ab3-a75c-76e3ed8eb36c",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# %pip install -U pymysql cryptography sqlalchemy"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 15,
|
||
"id": "5aa0e35f-2a13-4c8e-a9fd-87b0bf72307e",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Engine(mysql+pymysql://guest:***@47.109.26.237:3306/hrs)"
|
||
]
|
||
},
|
||
"execution_count": 15,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Method 5: load data from a database server to create a DataFrame\n",
"from sqlalchemy import create_engine\n",
"\n",
"# Connection URL format: dialect+driver://user:password@host:port/database\n",
"engine = create_engine('mysql+pymysql://guest:Guest.618@47.109.26.237:3306/hrs')\n",
"engine"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 16,
|
||
"id": "4b344f17-f5a1-4d7d-ad3c-ede4b122609c",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>dname</th>\n",
|
||
" <th>dloc</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>dno</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>10</th>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>20</th>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>30</th>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>40</th>\n",
|
||
" <td>运维部</td>\n",
|
||
" <td>深圳</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" dname dloc\n",
|
||
"dno \n",
|
||
"10 会计部 北京\n",
|
||
"20 研发部 成都\n",
|
||
"30 销售部 重庆\n",
|
||
"40 运维部 深圳"
|
||
]
|
||
},
|
||
"execution_count": 16,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"dept_df = pd.read_sql('tb_dept', engine, index_col='dno')\n",
|
||
"dept_df"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 17,
|
||
"id": "c5d1ffa3-6962-4c26-ae92-a8d7bc7da0cb",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>ename</th>\n",
|
||
" <th>job</th>\n",
|
||
" <th>mgr</th>\n",
|
||
" <th>sal</th>\n",
|
||
" <th>comm</th>\n",
|
||
" <th>dno</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>eno</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>1359</th>\n",
|
||
" <td>胡一刀</td>\n",
|
||
" <td>销售员</td>\n",
|
||
" <td>3344.0</td>\n",
|
||
" <td>1800</td>\n",
|
||
" <td>200.0</td>\n",
|
||
" <td>30</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2056</th>\n",
|
||
" <td>乔峰</td>\n",
|
||
" <td>分析师</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>5000</td>\n",
|
||
" <td>1500.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3088</th>\n",
|
||
" <td>李莫愁</td>\n",
|
||
" <td>设计师</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>3500</td>\n",
|
||
" <td>800.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3211</th>\n",
|
||
" <td>张无忌</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>3200</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3233</th>\n",
|
||
" <td>丘处机</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>3400</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3244</th>\n",
|
||
" <td>欧阳锋</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>3088.0</td>\n",
|
||
" <td>3200</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3251</th>\n",
|
||
" <td>张翠山</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>4000</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3344</th>\n",
|
||
" <td>黄蓉</td>\n",
|
||
" <td>销售主管</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>3000</td>\n",
|
||
" <td>800.0</td>\n",
|
||
" <td>30</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3577</th>\n",
|
||
" <td>杨过</td>\n",
|
||
" <td>会计</td>\n",
|
||
" <td>5566.0</td>\n",
|
||
" <td>2200</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3588</th>\n",
|
||
" <td>朱九真</td>\n",
|
||
" <td>会计</td>\n",
|
||
" <td>5566.0</td>\n",
|
||
" <td>2500</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4466</th>\n",
|
||
" <td>苗人凤</td>\n",
|
||
" <td>销售员</td>\n",
|
||
" <td>3344.0</td>\n",
|
||
" <td>2500</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>30</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5234</th>\n",
|
||
" <td>郭靖</td>\n",
|
||
" <td>出纳</td>\n",
|
||
" <td>5566.0</td>\n",
|
||
" <td>2000</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5566</th>\n",
|
||
" <td>宋远桥</td>\n",
|
||
" <td>会计师</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>4000</td>\n",
|
||
" <td>1000.0</td>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7800</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>9000</td>\n",
|
||
" <td>1200.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" ename job mgr sal comm dno\n",
|
||
"eno \n",
|
||
"1359 胡一刀 销售员 3344.0 1800 200.0 30\n",
|
||
"2056 乔峰 分析师 7800.0 5000 1500.0 20\n",
|
||
"3088 李莫愁 设计师 2056.0 3500 800.0 20\n",
|
||
"3211 张无忌 程序员 2056.0 3200 NaN 20\n",
|
||
"3233 丘处机 程序员 2056.0 3400 NaN 20\n",
|
||
"3244 欧阳锋 程序员 3088.0 3200 NaN 20\n",
|
||
"3251 张翠山 程序员 2056.0 4000 NaN 20\n",
|
||
"3344 黄蓉 销售主管 7800.0 3000 800.0 30\n",
|
||
"3577 杨过 会计 5566.0 2200 NaN 10\n",
|
||
"3588 朱九真 会计 5566.0 2500 NaN 10\n",
|
||
"4466 苗人凤 销售员 3344.0 2500 NaN 30\n",
|
||
"5234 郭靖 出纳 5566.0 2000 NaN 10\n",
|
||
"5566 宋远桥 会计师 7800.0 4000 1000.0 10\n",
|
||
"7800 张三丰 总裁 NaN 9000 1200.0 20"
|
||
]
|
||
},
|
||
"execution_count": 17,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"emp_df1 = pd.read_sql('tb_emp', engine, index_col='eno')\n",
|
||
"emp_df1"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 18,
|
||
"id": "f84b6886-09d8-4f13-89cc-487574991dba",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>ename</th>\n",
|
||
" <th>job</th>\n",
|
||
" <th>mgr</th>\n",
|
||
" <th>sal</th>\n",
|
||
" <th>comm</th>\n",
|
||
" <th>dno</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>eno</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>9500</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>50000</td>\n",
|
||
" <td>8000</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9600</th>\n",
|
||
" <td>王大锤</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>9800.0</td>\n",
|
||
" <td>8000</td>\n",
|
||
" <td>600</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9700</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>60000</td>\n",
|
||
" <td>6000</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9800</th>\n",
|
||
" <td>骆昊</td>\n",
|
||
" <td>架构师</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>30000</td>\n",
|
||
" <td>5000</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9900</th>\n",
|
||
" <td>陈小刀</td>\n",
|
||
" <td>分析师</td>\n",
|
||
" <td>9800.0</td>\n",
|
||
" <td>10000</td>\n",
|
||
" <td>1200</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" ename job mgr sal comm dno\n",
|
||
"eno \n",
|
||
"9500 张三丰 总裁 NaN 50000 8000 20\n",
|
||
"9600 王大锤 程序员 9800.0 8000 600 20\n",
|
||
"9700 张三丰 总裁 NaN 60000 6000 20\n",
|
||
"9800 骆昊 架构师 7800.0 30000 5000 20\n",
|
||
"9900 陈小刀 分析师 9800.0 10000 1200 20"
|
||
]
|
||
},
|
||
"execution_count": 18,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"emp_df2 = pd.read_sql('tb_emp2', engine, index_col='eno')\n",
|
||
"emp_df2"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 19,
|
||
"id": "c60e96d2-9a0d-4901-b39c-c31760de47a0",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
"# Dispose of the engine's connection pool to release resources\n",
"engine.dispose()"
|
||
]
|
||
},
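{
"cell_type": "markdown",
"id": "added-sketch-engine-lifecycle",
"metadata": {},
"source": [
"One way to scope database connections explicitly is to open them in `with` blocks and dispose of the engine at the end. The sketch below is an illustration only: it assumes an in-memory SQLite database and a hypothetical table `tb_demo` in place of the MySQL server used above.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"from sqlalchemy import create_engine\n",
"\n",
"# Assumption: in-memory SQLite stands in for the remote MySQL server.\n",
"engine = create_engine('sqlite://')\n",
"demo = pd.DataFrame({'dno': [10, 20], 'dname': ['A', 'B']})\n",
"\n",
"# engine.begin() opens a connection and commits when the block exits.\n",
"with engine.begin() as conn:\n",
"    demo.to_sql('tb_demo', conn, index=False)\n",
"\n",
"# The connection is released when the with block exits.\n",
"with engine.connect() as conn:\n",
"    loaded = pd.read_sql('tb_demo', conn, index_col='dno')\n",
"print(loaded)\n",
"\n",
"# Close all pooled connections once the engine is no longer needed.\n",
"engine.dispose()\n",
"```"
]
},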
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "12086a7a-c161-4753-9a8e-180f9e8b2edf",
|
||
"metadata": {},
|
||
"source": [
"### Inspecting Information"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 20,
|
||
"id": "785e58f9-b3f7-49a6-affc-8caaa66cebf1",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"<class 'pandas.core.frame.DataFrame'>\n",
|
||
"RangeIndex: 1945 entries, 0 to 1944\n",
|
||
"Data columns (total 8 columns):\n",
|
||
" # Column Non-Null Count Dtype \n",
|
||
"--- ------ -------------- ----- \n",
|
||
" 0 销售日期 1945 non-null datetime64[ns]\n",
|
||
" 1 销售区域 1945 non-null object \n",
|
||
" 2 销售渠道 1945 non-null object \n",
|
||
" 3 销售订单 1945 non-null object \n",
|
||
" 4 品牌 1945 non-null object \n",
|
||
" 5 售价 1945 non-null int64 \n",
|
||
" 6 销售数量 1945 non-null int64 \n",
|
||
" 7 直接成本 1945 non-null int64 \n",
|
||
"dtypes: datetime64[ns](1), int64(3), object(4)\n",
|
||
"memory usage: 121.7+ KB\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"df6.info()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 21,
|
||
"id": "fd8a9156-3939-430d-9738-60b3d8a95563",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>销售区域</th>\n",
|
||
" <th>销售渠道</th>\n",
|
||
" <th>销售订单</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>182894-455</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>3351</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>205635-402</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>219</td>\n",
|
||
" <td>29</td>\n",
|
||
" <td>1016</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>205654-021</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>85</td>\n",
|
||
" <td>6320</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 销售区域 销售渠道 销售订单 品牌 售价 销售数量 直接成本\n",
|
||
"0 2020-01-01 上海 拼多多 182894-455 八匹马 99 83 3351\n",
|
||
"1 2020-01-01 上海 抖音 205635-402 八匹马 219 29 1016\n",
|
||
"2 2020-01-01 上海 天猫 205654-021 八匹马 169 85 6320"
|
||
]
|
||
},
|
||
"execution_count": 21,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Get the first N rows\n",
|
||
"df6.head(3)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 22,
|
||
"id": "b75ace23-9b92-4425-b58f-bcd81e8d72e7",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>销售区域</th>\n",
|
||
" <th>销售渠道</th>\n",
|
||
" <th>销售订单</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>1940</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>D89677</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>1560</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1941</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>182719-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>79</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>3028</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1942</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>G70083</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>55</td>\n",
|
||
" <td>2277</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1943</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>211471-902/704</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>59</td>\n",
|
||
" <td>59</td>\n",
|
||
" <td>852</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1944</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>211807-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>27</td>\n",
|
||
" <td>435</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 销售区域 销售渠道 销售订单 品牌 售价 销售数量 直接成本\n",
|
||
"1940 2020-12-30 北京 京东 D89677 花花姑娘 269 26 1560\n",
|
||
"1941 2020-12-30 福建 实体 182719-050 八匹马 79 97 3028\n",
|
||
"1942 2020-12-31 福建 实体 G70083 花花姑娘 269 55 2277\n",
|
||
"1943 2020-12-31 福建 抖音 211471-902/704 八匹马 59 59 852\n",
|
||
"1944 2020-12-31 福建 天猫 211807-050 八匹马 99 27 435"
|
||
]
|
||
},
|
||
"execution_count": 22,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Get the last N rows\n",
|
||
"df6.tail(5)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "c2b2a909-0b40-473c-bb3f-85aca1925a19",
|
||
"metadata": {},
|
||
"source": [
|
||
"### Working with rows, columns, and cells"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 23,
|
||
"id": "fe964b3b-7f51-4202-b528-f5102d9be9f0",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 2020-01-01\n",
|
||
"1 2020-01-01\n",
|
||
"2 2020-01-01\n",
|
||
"3 2020-01-01\n",
|
||
"4 2020-01-01\n",
|
||
" ... \n",
|
||
"1940 2020-12-30\n",
|
||
"1941 2020-12-30\n",
|
||
"1942 2020-12-31\n",
|
||
"1943 2020-12-31\n",
|
||
"1944 2020-12-31\n",
|
||
"Name: 销售日期, Length: 1945, dtype: datetime64[ns]"
|
||
]
|
||
},
|
||
"execution_count": 23,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Access a single column (returns a Series)\n",
|
||
"df6['销售日期']"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 24,
|
||
"id": "b2e5ccb3-4b97-4a02-8316-b1321390f286",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 拼多多\n",
|
||
"1 抖音\n",
|
||
"2 天猫\n",
|
||
"3 天猫\n",
|
||
"4 天猫\n",
|
||
" ... \n",
|
||
"1940 京东\n",
|
||
"1941 实体\n",
|
||
"1942 实体\n",
|
||
"1943 抖音\n",
|
||
"1944 天猫\n",
|
||
"Name: 销售渠道, Length: 1945, dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 24,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Attribute-style column access; works only when the column name is a valid Python identifier\n",
"df6.销售渠道"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 25,
|
||
"id": "80ad78dc-4f47-4421-8478-ba7797350db4",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 拼多多\n",
|
||
"1 抖音\n",
|
||
"2 天猫\n",
|
||
"3 天猫\n",
|
||
"4 天猫\n",
|
||
" ... \n",
|
||
"1940 京东\n",
|
||
"1941 实体\n",
|
||
"1942 实体\n",
|
||
"1943 抖音\n",
|
||
"1944 天猫\n",
|
||
"Name: 销售渠道, Length: 1945, dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 25,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df6['销售渠道']"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 26,
|
||
"id": "7b970671-6f16-4e07-8666-715495de2832",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"pandas.core.series.Series"
|
||
]
|
||
},
|
||
"execution_count": 26,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"type(df6['销售日期'])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 27,
|
||
"id": "2c9cb56b-6a2b-479e-8c57-c61683858387",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售渠道</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>拼多多</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>抖音</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>天猫</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>天猫</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>天猫</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1940</th>\n",
|
||
" <td>京东</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1941</th>\n",
|
||
" <td>实体</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1942</th>\n",
|
||
" <td>实体</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1943</th>\n",
|
||
" <td>抖音</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1944</th>\n",
|
||
" <td>天猫</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>1945 rows × 1 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售渠道\n",
|
||
"0 拼多多\n",
|
||
"1 抖音\n",
|
||
"2 天猫\n",
|
||
"3 天猫\n",
|
||
"4 天猫\n",
|
||
"... ...\n",
|
||
"1940 京东\n",
|
||
"1941 实体\n",
|
||
"1942 实体\n",
|
||
"1943 抖音\n",
|
||
"1944 天猫\n",
|
||
"\n",
|
||
"[1945 rows x 1 columns]"
|
||
]
|
||
},
|
||
"execution_count": 27,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Indexing with a list of column names returns a DataFrame, even for one column\n",
"df6[['销售渠道']]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 28,
|
||
"id": "75730cd3-0459-4a62-97ee-e037256cc98a",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"pandas.core.frame.DataFrame"
|
||
]
|
||
},
|
||
"execution_count": 28,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"type(df6[['销售渠道']])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 29,
|
||
"id": "9e097e49-b762-4c9f-9d93-98abb1701d97",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>销售区域</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>3351</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>1016</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>6320</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>485</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>2452</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1940</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>1560</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1941</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>3028</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1942</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>2277</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1943</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>852</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1944</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>435</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>1945 rows × 3 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 销售区域 直接成本\n",
|
||
"0 2020-01-01 上海 3351\n",
|
||
"1 2020-01-01 上海 1016\n",
|
||
"2 2020-01-01 上海 6320\n",
|
||
"3 2020-01-01 上海 485\n",
|
||
"4 2020-01-01 上海 2452\n",
|
||
"... ... ... ...\n",
|
||
"1940 2020-12-30 北京 1560\n",
|
||
"1941 2020-12-30 福建 3028\n",
|
||
"1942 2020-12-31 福建 2277\n",
|
||
"1943 2020-12-31 福建 852\n",
|
||
"1944 2020-12-31 福建 435\n",
|
||
"\n",
|
||
"[1945 rows x 3 columns]"
|
||
]
|
||
},
|
||
"execution_count": 29,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Access multiple columns - fancy indexing\n",
|
||
"df6[['销售日期', '销售区域', '直接成本']]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 30,
|
||
"id": "cf31a169-549e-4182-8206-789f97316115",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Index(['销售订单', '品牌', '售价', '销售数量'], dtype='object')"
|
||
]
|
||
},
|
||
"execution_count": 30,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df6.columns[3:7]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 31,
|
||
"id": "792713c0-13bc-4810-86cc-5f6f6ce78719",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售订单</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>182894-455</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>83</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>205635-402</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>219</td>\n",
|
||
" <td>29</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>205654-021</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>85</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>205654-519</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>14</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>377781-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>61</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1940</th>\n",
|
||
" <td>D89677</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>26</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1941</th>\n",
|
||
" <td>182719-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>79</td>\n",
|
||
" <td>97</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1942</th>\n",
|
||
" <td>G70083</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>55</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1943</th>\n",
|
||
" <td>211471-902/704</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>59</td>\n",
|
||
" <td>59</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1944</th>\n",
|
||
" <td>211807-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>27</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>1945 rows × 4 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售订单 品牌 售价 销售数量\n",
|
||
"0 182894-455 八匹马 99 83\n",
|
||
"1 205635-402 八匹马 219 29\n",
|
||
"2 205654-021 八匹马 169 85\n",
|
||
"3 205654-519 八匹马 169 14\n",
|
||
"4 377781-010 皮皮虾 249 61\n",
|
||
"... ... ... ... ...\n",
|
||
"1940 D89677 花花姑娘 269 26\n",
|
||
"1941 182719-050 八匹马 79 97\n",
|
||
"1942 G70083 花花姑娘 269 55\n",
|
||
"1943 211471-902/704 八匹马 59 59\n",
|
||
"1944 211807-050 八匹马 99 27\n",
|
||
"\n",
|
||
"[1945 rows x 4 columns]"
|
||
]
|
||
},
|
||
"execution_count": 31,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df6[df6.columns[3:7]]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 32,
|
||
"id": "02d43b17-15e3-44d5-844b-a50d365bf863",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"销售日期 2020-12-31 00:00:00\n",
|
||
"销售区域 福建\n",
|
||
"销售渠道 天猫\n",
|
||
"销售订单 211807-050\n",
|
||
"品牌 八匹马\n",
|
||
"售价 99\n",
|
||
"销售数量 27\n",
|
||
"直接成本 435\n",
|
||
"Name: 1944, dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 32,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Access a row by index label - loc attribute\n",
|
||
"df6.loc[1944]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 33,
|
||
"id": "79da6932-f985-44dc-9f4b-e051e4749c65",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"销售日期 2020-12-31 00:00:00\n",
|
||
"销售区域 福建\n",
|
||
"销售渠道 天猫\n",
|
||
"销售订单 211807-050\n",
|
||
"品牌 八匹马\n",
|
||
"售价 99\n",
|
||
"销售数量 27\n",
|
||
"直接成本 435\n",
|
||
"Name: 1944, dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 33,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Access a row by integer position - iloc attribute\n",
"df6.iloc[-1]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 34,
|
||
"id": "6246b39b-7229-4e0f-af7b-0915e707492a",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>销售区域</th>\n",
|
||
" <th>销售渠道</th>\n",
|
||
" <th>销售订单</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>182894-455</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>3351</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>100</th>\n",
|
||
" <td>2020-01-15</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>529753-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>329</td>\n",
|
||
" <td>18</td>\n",
|
||
" <td>1839</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>58</th>\n",
|
||
" <td>2020-01-10</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>AWDH584-1</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>299</td>\n",
|
||
" <td>14</td>\n",
|
||
" <td>1495</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1000</th>\n",
|
||
" <td>2020-05-29</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>G71332</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>899</td>\n",
|
||
" <td>92</td>\n",
|
||
" <td>35120</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1000</th>\n",
|
||
" <td>2020-05-29</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>G71332</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>899</td>\n",
|
||
" <td>92</td>\n",
|
||
" <td>35120</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1000</th>\n",
|
||
" <td>2020-05-29</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>G71332</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>899</td>\n",
|
||
" <td>92</td>\n",
|
||
" <td>35120</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1099</th>\n",
|
||
" <td>2020-06-17</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>G70077</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>329</td>\n",
|
||
" <td>38</td>\n",
|
||
" <td>2266</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 销售区域 销售渠道 销售订单 品牌 售价 销售数量 直接成本\n",
|
||
"0 2020-01-01 上海 拼多多 182894-455 八匹马 99 83 3351\n",
|
||
"100 2020-01-15 福建 天猫 529753-010 皮皮虾 329 18 1839\n",
|
||
"58 2020-01-10 北京 天猫 AWDH584-1 壁虎 299 14 1495\n",
|
||
"1000 2020-05-29 上海 天猫 G71332 花花姑娘 899 92 35120\n",
|
||
"1000 2020-05-29 上海 天猫 G71332 花花姑娘 899 92 35120\n",
|
||
"1000 2020-05-29 上海 天猫 G71332 花花姑娘 899 92 35120\n",
|
||
"1099 2020-06-17 上海 拼多多 G70077 花花姑娘 329 38 2266"
|
||
]
|
||
},
|
||
"execution_count": 34,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Access multiple rows - fancy indexing (labels may repeat and appear in any order)\n",
|
||
"df6.loc[[0, 100, 58, 1000, 1000, 1000, 1099]]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 35,
|
||
"id": "77321324-0ca9-4c2e-a792-3c717189cb27",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>销售区域</th>\n",
|
||
" <th>销售渠道</th>\n",
|
||
" <th>销售订单</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>101</th>\n",
|
||
" <td>2020-01-15</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>532500-011</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>399</td>\n",
|
||
" <td>42</td>\n",
|
||
" <td>2771</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>102</th>\n",
|
||
" <td>2020-01-15</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>543179-011</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>429</td>\n",
|
||
" <td>92</td>\n",
|
||
" <td>10216</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>103</th>\n",
|
||
" <td>2020-01-15</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>543367-077</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>1199</td>\n",
|
||
" <td>73</td>\n",
|
||
" <td>16161</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>104</th>\n",
|
||
" <td>2020-01-15</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>634872-021</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>179</td>\n",
|
||
" <td>46</td>\n",
|
||
" <td>1322</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>105</th>\n",
|
||
" <td>2020-01-15</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>ADLG008-1</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>239</td>\n",
|
||
" <td>65</td>\n",
|
||
" <td>6154</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>196</th>\n",
|
||
" <td>2020-01-26</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>449794-494</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>98</td>\n",
|
||
" <td>9996</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>197</th>\n",
|
||
" <td>2020-01-26</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>543330-063</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>549</td>\n",
|
||
" <td>32</td>\n",
|
||
" <td>3581</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>198</th>\n",
|
||
" <td>2020-01-26</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>575088-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>399</td>\n",
|
||
" <td>40</td>\n",
|
||
" <td>4088</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>199</th>\n",
|
||
" <td>2020-01-26</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>575107-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>449</td>\n",
|
||
" <td>32</td>\n",
|
||
" <td>4144</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>200</th>\n",
|
||
" <td>2020-01-26</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>182721-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>85</td>\n",
|
||
" <td>3439</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>100 rows × 8 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 销售区域 销售渠道 销售订单 品牌 售价 销售数量 直接成本\n",
|
||
"101 2020-01-15 福建 天猫 532500-011 皮皮虾 399 42 2771\n",
|
||
"102 2020-01-15 福建 京东 543179-011 皮皮虾 429 92 10216\n",
|
||
"103 2020-01-15 福建 实体 543367-077 皮皮虾 1199 73 16161\n",
|
||
"104 2020-01-15 福建 拼多多 634872-021 皮皮虾 179 46 1322\n",
|
||
"105 2020-01-15 福建 抖音 ADLG008-1 壁虎 239 65 6154\n",
|
||
".. ... ... ... ... ... ... ... ...\n",
|
||
"196 2020-01-26 福建 拼多多 449794-494 皮皮虾 249 98 9996\n",
|
||
"197 2020-01-26 福建 抖音 543330-063 皮皮虾 549 32 3581\n",
|
||
"198 2020-01-26 福建 天猫 575088-010 皮皮虾 399 40 4088\n",
|
||
"199 2020-01-26 福建 天猫 575107-010 皮皮虾 449 32 4144\n",
|
||
"200 2020-01-26 福建 天猫 182721-050 八匹马 99 85 3439\n",
|
||
"\n",
|
||
"[100 rows x 8 columns]"
|
||
]
|
||
},
|
||
"execution_count": 35,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Access multiple rows - slice indexing (a loc label slice includes both endpoints)\n",
|
||
"df6.loc[101:200]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 36,
|
||
"id": "5eb250eb-18e0-4181-a37a-dec55c633116",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>销售区域</th>\n",
|
||
" <th>销售渠道</th>\n",
|
||
" <th>销售订单</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>101</th>\n",
|
||
" <td>2020-01-15</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>532500-011</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>399</td>\n",
|
||
" <td>42</td>\n",
|
||
" <td>2771</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>102</th>\n",
|
||
" <td>2020-01-15</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>543179-011</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>429</td>\n",
|
||
" <td>92</td>\n",
|
||
" <td>10216</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>103</th>\n",
|
||
" <td>2020-01-15</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>543367-077</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>1199</td>\n",
|
||
" <td>73</td>\n",
|
||
" <td>16161</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>104</th>\n",
|
||
" <td>2020-01-15</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>634872-021</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>179</td>\n",
|
||
" <td>46</td>\n",
|
||
" <td>1322</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>105</th>\n",
|
||
" <td>2020-01-15</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>ADLG008-1</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>239</td>\n",
|
||
" <td>65</td>\n",
|
||
" <td>6154</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>195</th>\n",
|
||
" <td>2020-01-26</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>449794-091</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>78</td>\n",
|
||
" <td>3424</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>196</th>\n",
|
||
" <td>2020-01-26</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>449794-494</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>98</td>\n",
|
||
" <td>9996</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>197</th>\n",
|
||
" <td>2020-01-26</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>543330-063</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>549</td>\n",
|
||
" <td>32</td>\n",
|
||
" <td>3581</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>198</th>\n",
|
||
" <td>2020-01-26</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>575088-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>399</td>\n",
|
||
" <td>40</td>\n",
|
||
" <td>4088</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>199</th>\n",
|
||
" <td>2020-01-26</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>575107-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>449</td>\n",
|
||
" <td>32</td>\n",
|
||
" <td>4144</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>99 rows × 8 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 销售区域 销售渠道 销售订单 品牌 售价 销售数量 直接成本\n",
|
||
"101 2020-01-15 福建 天猫 532500-011 皮皮虾 399 42 2771\n",
|
||
"102 2020-01-15 福建 京东 543179-011 皮皮虾 429 92 10216\n",
|
||
"103 2020-01-15 福建 实体 543367-077 皮皮虾 1199 73 16161\n",
|
||
"104 2020-01-15 福建 拼多多 634872-021 皮皮虾 179 46 1322\n",
|
||
"105 2020-01-15 福建 抖音 ADLG008-1 壁虎 239 65 6154\n",
|
||
".. ... ... ... ... ... ... ... ...\n",
|
||
"195 2020-01-26 福建 实体 449794-091 皮皮虾 249 78 3424\n",
|
||
"196 2020-01-26 福建 拼多多 449794-494 皮皮虾 249 98 9996\n",
|
||
"197 2020-01-26 福建 抖音 543330-063 皮皮虾 549 32 3581\n",
|
||
"198 2020-01-26 福建 天猫 575088-010 皮皮虾 399 40 4088\n",
|
||
"199 2020-01-26 福建 天猫 575107-010 皮皮虾 449 32 4144\n",
|
||
"\n",
|
||
"[99 rows x 8 columns]"
|
||
]
|
||
},
|
||
"execution_count": 36,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Positional slicing excludes the end position; df6[101:200] selects the same rows\n",
|
||
"df6.iloc[101:200]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 37,
|
||
"id": "f2daddd7-3635-40b1-9416-c1137315948c",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>销售区域</th>\n",
|
||
" <th>销售渠道</th>\n",
|
||
" <th>销售订单</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>1944</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>211807-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>27</td>\n",
|
||
" <td>435</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1943</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>211471-902/704</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>59</td>\n",
|
||
" <td>59</td>\n",
|
||
" <td>852</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1942</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>G70083</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>55</td>\n",
|
||
" <td>2277</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1941</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>182719-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>79</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>3028</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1940</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>D89677</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>1560</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1849</th>\n",
|
||
" <td>2020-12-03</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>543458-452</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>229</td>\n",
|
||
" <td>17</td>\n",
|
||
" <td>1041</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1848</th>\n",
|
||
" <td>2020-12-03</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>211894-021</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>76</td>\n",
|
||
" <td>3844</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1847</th>\n",
|
||
" <td>2020-12-02</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>182894-455</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>22</td>\n",
|
||
" <td>731</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1846</th>\n",
|
||
" <td>2020-12-01</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>158609-477</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>79</td>\n",
|
||
" <td>80</td>\n",
|
||
" <td>2436</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1845</th>\n",
|
||
" <td>2020-12-01</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>G89395</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>369</td>\n",
|
||
" <td>92</td>\n",
|
||
" <td>5291</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>100 rows × 8 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 销售区域 销售渠道 销售订单 品牌 售价 销售数量 直接成本\n",
|
||
"1944 2020-12-31 福建 天猫 211807-050 八匹马 99 27 435\n",
|
||
"1943 2020-12-31 福建 抖音 211471-902/704 八匹马 59 59 852\n",
|
||
"1942 2020-12-31 福建 实体 G70083 花花姑娘 269 55 2277\n",
|
||
"1941 2020-12-30 福建 实体 182719-050 八匹马 79 97 3028\n",
|
||
"1940 2020-12-30 北京 京东 D89677 花花姑娘 269 26 1560\n",
|
||
"... ... ... ... ... ... ... ... ...\n",
|
||
"1849 2020-12-03 福建 抖音 543458-452 皮皮虾 229 17 1041\n",
|
||
"1848 2020-12-03 福建 实体 211894-021 八匹马 169 76 3844\n",
|
||
"1847 2020-12-02 北京 京东 182894-455 八匹马 99 22 731\n",
|
||
"1846 2020-12-01 北京 天猫 158609-477 八匹马 79 80 2436\n",
|
||
"1845 2020-12-01 北京 天猫 G89395 花花姑娘 369 92 5291\n",
|
||
"\n",
|
||
"[100 rows x 8 columns]"
|
||
]
|
||
},
|
||
"execution_count": 37,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df6.iloc[-1:-101:-1]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 38,
|
||
"id": "9321811f-e62b-4db5-a478-cdc0934f097b",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"169"
|
||
]
|
||
},
|
||
"execution_count": 38,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Access a single cell by row label and column name\n",
|
||
"df6.at[2, '售价']"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 39,
|
||
"id": "bd1670bc-0a13-457f-95f1-352a4d61b3a7",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>销售区域</th>\n",
|
||
" <th>销售渠道</th>\n",
|
||
" <th>销售订单</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>182894-455</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>3351</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>205635-402</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>219</td>\n",
|
||
" <td>29</td>\n",
|
||
" <td>1016</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>205654-021</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>999</td>\n",
|
||
" <td>85</td>\n",
|
||
" <td>6320</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>205654-519</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>14</td>\n",
|
||
" <td>485</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>377781-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>61</td>\n",
|
||
" <td>2452</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1940</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>D89677</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>1560</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1941</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>182719-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>79</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>3028</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1942</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>G70083</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>55</td>\n",
|
||
" <td>2277</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1943</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>211471-902/704</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>59</td>\n",
|
||
" <td>59</td>\n",
|
||
" <td>852</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1944</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>211807-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>27</td>\n",
|
||
" <td>435</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>1945 rows × 8 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 销售区域 销售渠道 销售订单 品牌 售价 销售数量 直接成本\n",
|
||
"0 2020-01-01 上海 拼多多 182894-455 八匹马 99 83 3351\n",
|
||
"1 2020-01-01 上海 抖音 205635-402 八匹马 219 29 1016\n",
|
||
"2 2020-01-01 上海 天猫 205654-021 八匹马 999 85 6320\n",
|
||
"3 2020-01-01 上海 天猫 205654-519 八匹马 169 14 485\n",
|
||
"4 2020-01-01 上海 天猫 377781-010 皮皮虾 249 61 2452\n",
|
||
"... ... ... ... ... ... ... ... ...\n",
|
||
"1940 2020-12-30 北京 京东 D89677 花花姑娘 269 26 1560\n",
|
||
"1941 2020-12-30 福建 实体 182719-050 八匹马 79 97 3028\n",
|
||
"1942 2020-12-31 福建 实体 G70083 花花姑娘 269 55 2277\n",
|
||
"1943 2020-12-31 福建 抖音 211471-902/704 八匹马 59 59 852\n",
|
||
"1944 2020-12-31 福建 天猫 211807-050 八匹马 99 27 435\n",
|
||
"\n",
|
||
"[1945 rows x 8 columns]"
|
||
]
|
||
},
|
||
"execution_count": 39,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df6.at[2, '售价'] = 999\n",
|
||
"df6"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 40,
|
||
"id": "7460ef03-3f45-4cc0-99a3-85039c2606b0",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>销售区域</th>\n",
|
||
" <th>销售渠道</th>\n",
|
||
" <th>销售订单</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>182894-455</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>3351</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>205635-402</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>219</td>\n",
|
||
" <td>29</td>\n",
|
||
" <td>1016</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>205654-021</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>888</td>\n",
|
||
" <td>85</td>\n",
|
||
" <td>6320</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>205654-519</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>14</td>\n",
|
||
" <td>485</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>377781-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>61</td>\n",
|
||
" <td>2452</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1940</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>D89677</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>1560</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1941</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>182719-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>79</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>3028</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1942</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>G70083</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>55</td>\n",
|
||
" <td>2277</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1943</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>211471-902/704</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>59</td>\n",
|
||
" <td>59</td>\n",
|
||
" <td>852</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1944</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>211807-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>27</td>\n",
|
||
" <td>435</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>1945 rows × 8 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 销售区域 销售渠道 销售订单 品牌 售价 销售数量 直接成本\n",
|
||
"0 2020-01-01 上海 拼多多 182894-455 八匹马 99 83 3351\n",
|
||
"1 2020-01-01 上海 抖音 205635-402 八匹马 219 29 1016\n",
|
||
"2 2020-01-01 上海 天猫 205654-021 八匹马 888 85 6320\n",
|
||
"3 2020-01-01 上海 天猫 205654-519 八匹马 169 14 485\n",
|
||
"4 2020-01-01 上海 天猫 377781-010 皮皮虾 249 61 2452\n",
|
||
"... ... ... ... ... ... ... ... ...\n",
|
||
"1940 2020-12-30 北京 京东 D89677 花花姑娘 269 26 1560\n",
|
||
"1941 2020-12-30 福建 实体 182719-050 八匹马 79 97 3028\n",
|
||
"1942 2020-12-31 福建 实体 G70083 花花姑娘 269 55 2277\n",
|
||
"1943 2020-12-31 福建 抖音 211471-902/704 八匹马 59 59 852\n",
|
||
"1944 2020-12-31 福建 天猫 211807-050 八匹马 99 27 435\n",
|
||
"\n",
|
||
"[1945 rows x 8 columns]"
|
||
]
|
||
},
|
||
"execution_count": 40,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df6.iat[2, -3] = 888\n",
|
||
"df6"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 41,
|
||
"id": "34c81da6-f58f-4c36-8596-004266e9374b",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>销售区域</th>\n",
|
||
" <th>销售渠道</th>\n",
|
||
" <th>销售订单</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" <th>销售额</th>\n",
|
||
" <th>季度</th>\n",
|
||
" <th>月份</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>182894-455</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>3351</td>\n",
|
||
" <td>8217</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>205635-402</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>219</td>\n",
|
||
" <td>29</td>\n",
|
||
" <td>1016</td>\n",
|
||
" <td>6351</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>205654-021</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>888</td>\n",
|
||
" <td>85</td>\n",
|
||
" <td>6320</td>\n",
|
||
" <td>75480</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>205654-519</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>14</td>\n",
|
||
" <td>485</td>\n",
|
||
" <td>2366</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>377781-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>61</td>\n",
|
||
" <td>2452</td>\n",
|
||
" <td>15189</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1940</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>D89677</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>1560</td>\n",
|
||
" <td>6994</td>\n",
|
||
" <td>4</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1941</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>182719-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>79</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>3028</td>\n",
|
||
" <td>7663</td>\n",
|
||
" <td>4</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1942</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>G70083</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>55</td>\n",
|
||
" <td>2277</td>\n",
|
||
" <td>14795</td>\n",
|
||
" <td>4</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1943</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>211471-902/704</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>59</td>\n",
|
||
" <td>59</td>\n",
|
||
" <td>852</td>\n",
|
||
" <td>3481</td>\n",
|
||
" <td>4</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1944</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>211807-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>27</td>\n",
|
||
" <td>435</td>\n",
|
||
" <td>2673</td>\n",
|
||
" <td>4</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>1945 rows × 11 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 销售区域 销售渠道 销售订单 品牌 售价 销售数量 直接成本 销售额 季度 \\\n",
|
||
"0 2020-01-01 上海 拼多多 182894-455 八匹马 99 83 3351 8217 1 \n",
|
||
"1 2020-01-01 上海 抖音 205635-402 八匹马 219 29 1016 6351 1 \n",
|
||
"2 2020-01-01 上海 天猫 205654-021 八匹马 888 85 6320 75480 1 \n",
|
||
"3 2020-01-01 上海 天猫 205654-519 八匹马 169 14 485 2366 1 \n",
|
||
"4 2020-01-01 上海 天猫 377781-010 皮皮虾 249 61 2452 15189 1 \n",
|
||
"... ... ... ... ... ... ... ... ... ... .. \n",
|
||
"1940 2020-12-30 北京 京东 D89677 花花姑娘 269 26 1560 6994 4 \n",
|
||
"1941 2020-12-30 福建 实体 182719-050 八匹马 79 97 3028 7663 4 \n",
|
||
"1942 2020-12-31 福建 实体 G70083 花花姑娘 269 55 2277 14795 4 \n",
|
||
"1943 2020-12-31 福建 抖音 211471-902/704 八匹马 59 59 852 3481 4 \n",
|
||
"1944 2020-12-31 福建 天猫 211807-050 八匹马 99 27 435 2673 4 \n",
|
||
"\n",
|
||
" 月份 \n",
|
||
"0 1 \n",
|
||
"1 1 \n",
|
||
"2 1 \n",
|
||
"3 1 \n",
|
||
"4 1 \n",
|
||
"... .. \n",
|
||
"1940 12 \n",
|
||
"1941 12 \n",
|
||
"1942 12 \n",
|
||
"1943 12 \n",
|
||
"1944 12 \n",
|
||
"\n",
|
||
"[1945 rows x 11 columns]"
|
||
]
|
||
},
|
||
"execution_count": 41,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Add columns\n",
|
||
"df6['销售额'] = df6['售价'] * df6['销售数量']\n",
|
||
"df6['季度'] = df6['销售日期'].dt.quarter\n",
|
||
"df6['月份'] = df6['销售日期'].dt.month\n",
|
||
"df6"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 42,
|
||
"id": "c3c60210-202d-4bd8-8804-1d657746b29c",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Add rows - rarely useful in real-world work"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 43,
|
||
"id": "6bf78f3d-05a2-4c7a-a0f0-fb6659f1bd6f",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>销售区域</th>\n",
|
||
" <th>销售渠道</th>\n",
|
||
" <th>销售订单</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" <th>销售额</th>\n",
|
||
" <th>月份</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>182894-455</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>3351</td>\n",
|
||
" <td>8217</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>205635-402</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>219</td>\n",
|
||
" <td>29</td>\n",
|
||
" <td>1016</td>\n",
|
||
" <td>6351</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>205654-021</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>888</td>\n",
|
||
" <td>85</td>\n",
|
||
" <td>6320</td>\n",
|
||
" <td>75480</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>205654-519</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>14</td>\n",
|
||
" <td>485</td>\n",
|
||
" <td>2366</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>377781-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>61</td>\n",
|
||
" <td>2452</td>\n",
|
||
" <td>15189</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1940</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>D89677</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>1560</td>\n",
|
||
" <td>6994</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1941</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>182719-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>79</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>3028</td>\n",
|
||
" <td>7663</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1942</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>G70083</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>55</td>\n",
|
||
" <td>2277</td>\n",
|
||
" <td>14795</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1943</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>211471-902/704</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>59</td>\n",
|
||
" <td>59</td>\n",
|
||
" <td>852</td>\n",
|
||
" <td>3481</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1944</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>211807-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>27</td>\n",
|
||
" <td>435</td>\n",
|
||
" <td>2673</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>1945 rows × 10 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 销售区域 销售渠道 销售订单 品牌 售价 销售数量 直接成本 销售额 月份\n",
|
||
"0 2020-01-01 上海 拼多多 182894-455 八匹马 99 83 3351 8217 1\n",
|
||
"1 2020-01-01 上海 抖音 205635-402 八匹马 219 29 1016 6351 1\n",
|
||
"2 2020-01-01 上海 天猫 205654-021 八匹马 888 85 6320 75480 1\n",
|
||
"3 2020-01-01 上海 天猫 205654-519 八匹马 169 14 485 2366 1\n",
|
||
"4 2020-01-01 上海 天猫 377781-010 皮皮虾 249 61 2452 15189 1\n",
|
||
"... ... ... ... ... ... ... ... ... ... ..\n",
|
||
"1940 2020-12-30 北京 京东 D89677 花花姑娘 269 26 1560 6994 12\n",
|
||
"1941 2020-12-30 福建 实体 182719-050 八匹马 79 97 3028 7663 12\n",
|
||
"1942 2020-12-31 福建 实体 G70083 花花姑娘 269 55 2277 14795 12\n",
|
||
"1943 2020-12-31 福建 抖音 211471-902/704 八匹马 59 59 852 3481 12\n",
|
||
"1944 2020-12-31 福建 天猫 211807-050 八匹马 99 27 435 2673 12\n",
|
||
"\n",
|
||
"[1945 rows x 10 columns]"
|
||
]
|
||
},
|
||
"execution_count": 43,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Drop columns\n",
|
||
"# inplace=False - the default - leaves the original object unchanged and returns a new, modified object\n",
|
||
"# inplace=True - modifies the DataFrame in place and returns None instead of a new object\n",
|
||
"df6.drop(columns=['季度'], inplace=True)\n",
|
||
"df6"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 44,
|
||
"id": "cdf8cf10-5193-4c38-8fef-bc3d38a8a0a8",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>销售区域</th>\n",
|
||
" <th>销售渠道</th>\n",
|
||
" <th>销售订单</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" <th>销售额</th>\n",
|
||
" <th>月份</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>205654-519</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>14</td>\n",
|
||
" <td>485</td>\n",
|
||
" <td>2366</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>377781-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>61</td>\n",
|
||
" <td>2452</td>\n",
|
||
" <td>15189</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>2020-01-02</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>543369-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>799</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>15203</td>\n",
|
||
" <td>54332</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>2020-01-02</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>588685-002</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>299</td>\n",
|
||
" <td>91</td>\n",
|
||
" <td>8008</td>\n",
|
||
" <td>27209</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>2020-01-03</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>AKLH641-1</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>239</td>\n",
|
||
" <td>82</td>\n",
|
||
" <td>4127</td>\n",
|
||
" <td>19598</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1938</th>\n",
|
||
" <td>2020-12-29</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>588682-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>50</td>\n",
|
||
" <td>4388</td>\n",
|
||
" <td>13450</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1939</th>\n",
|
||
" <td>2020-12-29</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>599007-513</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>349</td>\n",
|
||
" <td>18</td>\n",
|
||
" <td>2466</td>\n",
|
||
" <td>6282</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1940</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>D89677</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>1560</td>\n",
|
||
" <td>6994</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1941</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>182719-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>79</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>3028</td>\n",
|
||
" <td>7663</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1942</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>G70083</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>55</td>\n",
|
||
" <td>2277</td>\n",
|
||
" <td>14795</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>1939 rows × 10 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 销售区域 销售渠道 销售订单 品牌 售价 销售数量 直接成本 销售额 月份\n",
|
||
"3 2020-01-01 上海 天猫 205654-519 八匹马 169 14 485 2366 1\n",
|
||
"4 2020-01-01 上海 天猫 377781-010 皮皮虾 249 61 2452 15189 1\n",
|
||
"5 2020-01-02 上海 京东 543369-010 皮皮虾 799 68 15203 54332 1\n",
|
||
"6 2020-01-02 上海 拼多多 588685-002 皮皮虾 299 91 8008 27209 1\n",
|
||
"7 2020-01-03 上海 天猫 AKLH641-1 壁虎 239 82 4127 19598 1\n",
|
||
"... ... ... ... ... ... ... ... ... ... ..\n",
|
||
"1938 2020-12-29 北京 拼多多 588682-010 皮皮虾 269 50 4388 13450 12\n",
|
||
"1939 2020-12-29 北京 天猫 599007-513 皮皮虾 349 18 2466 6282 12\n",
|
||
"1940 2020-12-30 北京 京东 D89677 花花姑娘 269 26 1560 6994 12\n",
|
||
"1941 2020-12-30 福建 实体 182719-050 八匹马 79 97 3028 7663 12\n",
|
||
"1942 2020-12-31 福建 实体 G70083 花花姑娘 269 55 2277 14795 12\n",
|
||
"\n",
|
||
"[1939 rows x 10 columns]"
|
||
]
|
||
},
|
||
"execution_count": 44,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Drop rows by index label\n",
|
||
"# df6.drop(index=[0, 1, 2, 100, 1944, 1943])\n",
|
||
"df6.drop(index=[0, 1, 2, 100, 1944, 1943], inplace=True)\n",
|
||
"df6"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 45,
|
||
"id": "1ddfe77d-aa92-4d6a-b2db-8469b1222ed3",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>销售区域</th>\n",
|
||
" <th>销售渠道</th>\n",
|
||
" <th>销售订单</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" <th>销售额</th>\n",
|
||
" <th>月份</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>205654-519</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>14</td>\n",
|
||
" <td>485</td>\n",
|
||
" <td>2366</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>377781-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>61</td>\n",
|
||
" <td>2452</td>\n",
|
||
" <td>15189</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>2020-01-02</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>543369-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>799</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>15203</td>\n",
|
||
" <td>54332</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>2020-01-02</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>588685-002</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>299</td>\n",
|
||
" <td>91</td>\n",
|
||
" <td>8008</td>\n",
|
||
" <td>27209</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>2020-01-03</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>AKLH641-1</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>239</td>\n",
|
||
" <td>82</td>\n",
|
||
" <td>4127</td>\n",
|
||
" <td>19598</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1938</th>\n",
|
||
" <td>2020-12-29</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>588682-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>50</td>\n",
|
||
" <td>4388</td>\n",
|
||
" <td>13450</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1939</th>\n",
|
||
" <td>2020-12-29</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>599007-513</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>349</td>\n",
|
||
" <td>18</td>\n",
|
||
" <td>2466</td>\n",
|
||
" <td>6282</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1940</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>D89677</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>1560</td>\n",
|
||
" <td>6994</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1941</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>182719-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>79</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>3028</td>\n",
|
||
" <td>7663</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1942</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>G70083</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>55</td>\n",
|
||
" <td>2277</td>\n",
|
||
" <td>14795</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>1839 rows × 10 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 销售区域 销售渠道 销售订单 品牌 售价 销售数量 直接成本 销售额 月份\n",
|
||
"3 2020-01-01 上海 天猫 205654-519 八匹马 169 14 485 2366 1\n",
|
||
"4 2020-01-01 上海 天猫 377781-010 皮皮虾 249 61 2452 15189 1\n",
|
||
"5 2020-01-02 上海 京东 543369-010 皮皮虾 799 68 15203 54332 1\n",
|
||
"6 2020-01-02 上海 拼多多 588685-002 皮皮虾 299 91 8008 27209 1\n",
|
||
"7 2020-01-03 上海 天猫 AKLH641-1 壁虎 239 82 4127 19598 1\n",
|
||
"... ... ... ... ... ... ... ... ... ... ..\n",
|
||
"1938 2020-12-29 北京 拼多多 588682-010 皮皮虾 269 50 4388 13450 12\n",
|
||
"1939 2020-12-29 北京 天猫 599007-513 皮皮虾 349 18 2466 6282 12\n",
|
||
"1940 2020-12-30 北京 京东 D89677 花花姑娘 269 26 1560 6994 12\n",
|
||
"1941 2020-12-30 福建 实体 182719-050 八匹马 79 97 3028 7663 12\n",
|
||
"1942 2020-12-31 福建 实体 G70083 花花姑娘 269 55 2277 14795 12\n",
|
||
"\n",
|
||
"[1839 rows x 10 columns]"
|
||
]
|
||
},
|
||
"execution_count": 45,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Drop the 100 rows at positions 100-199 (labels looked up via df6.index), modifying df6 in place\n",
"df6.drop(index=df6.index[100:200], inplace=True)\n",
"df6"
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 46,
|
||
"id": "8020bbb0-740e-496a-9224-fe3495a19c92",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>区域</th>\n",
|
||
" <th>渠道</th>\n",
|
||
" <th>订单号</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" <th>销售额</th>\n",
|
||
" <th>月份</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>205654-519</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>14</td>\n",
|
||
" <td>485</td>\n",
|
||
" <td>2366</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>377781-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>61</td>\n",
|
||
" <td>2452</td>\n",
|
||
" <td>15189</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>2020-01-02</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>543369-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>799</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>15203</td>\n",
|
||
" <td>54332</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>2020-01-02</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>588685-002</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>299</td>\n",
|
||
" <td>91</td>\n",
|
||
" <td>8008</td>\n",
|
||
" <td>27209</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>2020-01-03</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>AKLH641-1</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>239</td>\n",
|
||
" <td>82</td>\n",
|
||
" <td>4127</td>\n",
|
||
" <td>19598</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1938</th>\n",
|
||
" <td>2020-12-29</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>588682-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>50</td>\n",
|
||
" <td>4388</td>\n",
|
||
" <td>13450</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1939</th>\n",
|
||
" <td>2020-12-29</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>599007-513</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>349</td>\n",
|
||
" <td>18</td>\n",
|
||
" <td>2466</td>\n",
|
||
" <td>6282</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1940</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>D89677</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>1560</td>\n",
|
||
" <td>6994</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1941</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>182719-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>79</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>3028</td>\n",
|
||
" <td>7663</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1942</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>G70083</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>55</td>\n",
|
||
" <td>2277</td>\n",
|
||
" <td>14795</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>1839 rows × 10 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 区域 渠道 订单号 品牌 售价 销售数量 直接成本 销售额 月份\n",
|
||
"3 2020-01-01 上海 天猫 205654-519 八匹马 169 14 485 2366 1\n",
|
||
"4 2020-01-01 上海 天猫 377781-010 皮皮虾 249 61 2452 15189 1\n",
|
||
"5 2020-01-02 上海 京东 543369-010 皮皮虾 799 68 15203 54332 1\n",
|
||
"6 2020-01-02 上海 拼多多 588685-002 皮皮虾 299 91 8008 27209 1\n",
|
||
"7 2020-01-03 上海 天猫 AKLH641-1 壁虎 239 82 4127 19598 1\n",
|
||
"... ... .. ... ... ... ... ... ... ... ..\n",
|
||
"1938 2020-12-29 北京 拼多多 588682-010 皮皮虾 269 50 4388 13450 12\n",
|
||
"1939 2020-12-29 北京 天猫 599007-513 皮皮虾 349 18 2466 6282 12\n",
|
||
"1940 2020-12-30 北京 京东 D89677 花花姑娘 269 26 1560 6994 12\n",
|
||
"1941 2020-12-30 福建 实体 182719-050 八匹马 79 97 3028 7663 12\n",
|
||
"1942 2020-12-31 福建 实体 G70083 花花姑娘 269 55 2277 14795 12\n",
|
||
"\n",
|
||
"[1839 rows x 10 columns]"
|
||
]
|
||
},
|
||
"execution_count": 46,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Rename columns\n",
"df6.rename(columns={'销售区域': '区域', '销售渠道': '渠道', '销售订单': '订单号'}, inplace=True)\n",
"df6"
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 47,
|
||
"id": "d028d2be-0944-4b70-a3ea-f7d06cdd458f",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>区域</th>\n",
|
||
" <th>渠道</th>\n",
|
||
" <th>订单号</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" <th>销售额</th>\n",
|
||
" <th>月份</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>205654-519</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>14</td>\n",
|
||
" <td>485</td>\n",
|
||
" <td>2366</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>377781-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>61</td>\n",
|
||
" <td>2452</td>\n",
|
||
" <td>15189</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>2020-01-02</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>543369-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>799</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>15203</td>\n",
|
||
" <td>54332</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>2020-01-02</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>588685-002</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>299</td>\n",
|
||
" <td>91</td>\n",
|
||
" <td>8008</td>\n",
|
||
" <td>27209</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>2020-01-03</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>AKLH641-1</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>239</td>\n",
|
||
" <td>82</td>\n",
|
||
" <td>4127</td>\n",
|
||
" <td>19598</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1834</th>\n",
|
||
" <td>2020-12-29</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>588682-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>50</td>\n",
|
||
" <td>4388</td>\n",
|
||
" <td>13450</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1835</th>\n",
|
||
" <td>2020-12-29</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>599007-513</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>349</td>\n",
|
||
" <td>18</td>\n",
|
||
" <td>2466</td>\n",
|
||
" <td>6282</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1836</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>D89677</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>1560</td>\n",
|
||
" <td>6994</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1837</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>182719-050</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>79</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>3028</td>\n",
|
||
" <td>7663</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1838</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>G70083</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>55</td>\n",
|
||
" <td>2277</td>\n",
|
||
" <td>14795</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>1839 rows × 10 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 区域 渠道 订单号 品牌 售价 销售数量 直接成本 销售额 月份\n",
|
||
"0 2020-01-01 上海 天猫 205654-519 八匹马 169 14 485 2366 1\n",
|
||
"1 2020-01-01 上海 天猫 377781-010 皮皮虾 249 61 2452 15189 1\n",
|
||
"2 2020-01-02 上海 京东 543369-010 皮皮虾 799 68 15203 54332 1\n",
|
||
"3 2020-01-02 上海 拼多多 588685-002 皮皮虾 299 91 8008 27209 1\n",
|
||
"4 2020-01-03 上海 天猫 AKLH641-1 壁虎 239 82 4127 19598 1\n",
|
||
"... ... .. ... ... ... ... ... ... ... ..\n",
|
||
"1834 2020-12-29 北京 拼多多 588682-010 皮皮虾 269 50 4388 13450 12\n",
|
||
"1835 2020-12-29 北京 天猫 599007-513 皮皮虾 349 18 2466 6282 12\n",
|
||
"1836 2020-12-30 北京 京东 D89677 花花姑娘 269 26 1560 6994 12\n",
|
||
"1837 2020-12-30 福建 实体 182719-050 八匹马 79 97 3028 7663 12\n",
|
||
"1838 2020-12-31 福建 实体 G70083 花花姑娘 269 55 2277 14795 12\n",
|
||
"\n",
|
||
"[1839 rows x 10 columns]"
|
||
]
|
||
},
|
||
"execution_count": 47,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Reset the index\n",
"# drop=False (the default): the old index is kept as an ordinary column\n",
"# drop=True: the old index is discarded\n",
"df6.reset_index(drop=True, inplace=True)\n",
"df6"
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 48,
|
||
"id": "cb55a518-f4bd-4fac-8554-4353c0798bc6",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>区域</th>\n",
|
||
" <th>渠道</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" <th>销售额</th>\n",
|
||
" <th>月份</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>订单号</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>205654-519</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>14</td>\n",
|
||
" <td>485</td>\n",
|
||
" <td>2366</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>377781-010</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>61</td>\n",
|
||
" <td>2452</td>\n",
|
||
" <td>15189</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>543369-010</th>\n",
|
||
" <td>2020-01-02</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>799</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>15203</td>\n",
|
||
" <td>54332</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>588685-002</th>\n",
|
||
" <td>2020-01-02</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>299</td>\n",
|
||
" <td>91</td>\n",
|
||
" <td>8008</td>\n",
|
||
" <td>27209</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>AKLH641-1</th>\n",
|
||
" <td>2020-01-03</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>239</td>\n",
|
||
" <td>82</td>\n",
|
||
" <td>4127</td>\n",
|
||
" <td>19598</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>588682-010</th>\n",
|
||
" <td>2020-12-29</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>50</td>\n",
|
||
" <td>4388</td>\n",
|
||
" <td>13450</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>599007-513</th>\n",
|
||
" <td>2020-12-29</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>349</td>\n",
|
||
" <td>18</td>\n",
|
||
" <td>2466</td>\n",
|
||
" <td>6282</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>D89677</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>1560</td>\n",
|
||
" <td>6994</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>182719-050</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>79</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>3028</td>\n",
|
||
" <td>7663</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>G70083</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>55</td>\n",
|
||
" <td>2277</td>\n",
|
||
" <td>14795</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>1839 rows × 9 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 区域 渠道 品牌 售价 销售数量 直接成本 销售额 月份\n",
|
||
"订单号 \n",
|
||
"205654-519 2020-01-01 上海 天猫 八匹马 169 14 485 2366 1\n",
|
||
"377781-010 2020-01-01 上海 天猫 皮皮虾 249 61 2452 15189 1\n",
|
||
"543369-010 2020-01-02 上海 京东 皮皮虾 799 68 15203 54332 1\n",
|
||
"588685-002 2020-01-02 上海 拼多多 皮皮虾 299 91 8008 27209 1\n",
|
||
"AKLH641-1 2020-01-03 上海 天猫 壁虎 239 82 4127 19598 1\n",
|
||
"... ... .. ... ... ... ... ... ... ..\n",
|
||
"588682-010 2020-12-29 北京 拼多多 皮皮虾 269 50 4388 13450 12\n",
|
||
"599007-513 2020-12-29 北京 天猫 皮皮虾 349 18 2466 6282 12\n",
|
||
"D89677 2020-12-30 北京 京东 花花姑娘 269 26 1560 6994 12\n",
|
||
"182719-050 2020-12-30 福建 实体 八匹马 79 97 3028 7663 12\n",
|
||
"G70083 2020-12-31 福建 实体 花花姑娘 269 55 2277 14795 12\n",
|
||
"\n",
|
||
"[1839 rows x 9 columns]"
|
||
]
|
||
},
|
||
"execution_count": 48,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Set the index: use the 订单号 column as the row index\n",
"df6.set_index('订单号', inplace=True)\n",
"df6"
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 49,
|
||
"id": "101bd804-5a90-4cd3-a545-613df6d9b8e5",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>区域</th>\n",
|
||
" <th>渠道</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" <th>销售额</th>\n",
|
||
" <th>月份</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>订单号</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>G70509</th>\n",
|
||
" <td>2020-02-03</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>1499</td>\n",
|
||
" <td>89</td>\n",
|
||
" <td>52302</td>\n",
|
||
" <td>133411</td>\n",
|
||
" <td>2</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>G72186</th>\n",
|
||
" <td>2020-04-11</td>\n",
|
||
" <td>江苏</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>1299</td>\n",
|
||
" <td>88</td>\n",
|
||
" <td>18381</td>\n",
|
||
" <td>114312</td>\n",
|
||
" <td>4</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>543367-077</th>\n",
|
||
" <td>2020-04-12</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>1199</td>\n",
|
||
" <td>88</td>\n",
|
||
" <td>25674</td>\n",
|
||
" <td>105512</td>\n",
|
||
" <td>4</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>G68188</th>\n",
|
||
" <td>2020-06-08</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>1299</td>\n",
|
||
" <td>80</td>\n",
|
||
" <td>29819</td>\n",
|
||
" <td>103920</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>577714-010</th>\n",
|
||
" <td>2020-06-17</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>1199</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>40884</td>\n",
|
||
" <td>116303</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>543367-077</th>\n",
|
||
" <td>2020-08-28</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>1199</td>\n",
|
||
" <td>89</td>\n",
|
||
" <td>45442</td>\n",
|
||
" <td>106711</td>\n",
|
||
" <td>8</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>G68188</th>\n",
|
||
" <td>2020-09-19</td>\n",
|
||
" <td>广东</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>1299</td>\n",
|
||
" <td>93</td>\n",
|
||
" <td>34290</td>\n",
|
||
" <td>120807</td>\n",
|
||
" <td>9</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 区域 渠道 品牌 售价 销售数量 直接成本 销售额 月份\n",
|
||
"订单号 \n",
|
||
"G70509 2020-02-03 北京 拼多多 花花姑娘 1499 89 52302 133411 2\n",
|
||
"G72186 2020-04-11 江苏 天猫 花花姑娘 1299 88 18381 114312 4\n",
|
||
"543367-077 2020-04-12 北京 拼多多 皮皮虾 1199 88 25674 105512 4\n",
|
||
"G68188 2020-06-08 北京 拼多多 花花姑娘 1299 80 29819 103920 6\n",
|
||
"577714-010 2020-06-17 上海 拼多多 皮皮虾 1199 97 40884 116303 6\n",
|
||
"543367-077 2020-08-28 上海 天猫 皮皮虾 1199 89 45442 106711 8\n",
|
||
"G68188 2020-09-19 广东 拼多多 花花姑娘 1299 93 34290 120807 9"
|
||
]
|
||
},
|
||
"execution_count": 49,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Filter rows with boolean indexing\n",
"df6[df6['销售额'] > 100000]"
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 50,
|
||
"id": "64c83a43-fcb0-4ba1-9400-ae4a5b21715c",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>区域</th>\n",
|
||
" <th>渠道</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" <th>销售额</th>\n",
|
||
" <th>月份</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>订单号</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>G68188</th>\n",
|
||
" <td>2020-06-08</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>1299</td>\n",
|
||
" <td>80</td>\n",
|
||
" <td>29819</td>\n",
|
||
" <td>103920</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>577714-010</th>\n",
|
||
" <td>2020-06-17</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>1199</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>40884</td>\n",
|
||
" <td>116303</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 区域 渠道 品牌 售价 销售数量 直接成本 销售额 月份\n",
|
||
"订单号 \n",
|
||
"G68188 2020-06-08 北京 拼多多 花花姑娘 1299 80 29819 103920 6\n",
|
||
"577714-010 2020-06-17 上海 拼多多 皮皮虾 1199 97 40884 116303 6"
|
||
]
|
||
},
|
||
"execution_count": 50,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Combine conditions with & (element-wise AND); each comparison needs its own parentheses\n",
"df6[(df6['销售额'] > 100000) & (df6['月份'] == 6)]"
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 51,
|
||
"id": "22c01e56-b188-40f7-9e53-3a3d2f0bcb29",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>区域</th>\n",
|
||
" <th>渠道</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" <th>销售额</th>\n",
|
||
" <th>月份</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>订单号</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>G70509</th>\n",
|
||
" <td>2020-02-03</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>1499</td>\n",
|
||
" <td>89</td>\n",
|
||
" <td>52302</td>\n",
|
||
" <td>133411</td>\n",
|
||
" <td>2</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>G72186</th>\n",
|
||
" <td>2020-04-11</td>\n",
|
||
" <td>江苏</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>1299</td>\n",
|
||
" <td>88</td>\n",
|
||
" <td>18381</td>\n",
|
||
" <td>114312</td>\n",
|
||
" <td>4</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>543367-077</th>\n",
|
||
" <td>2020-04-12</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>1199</td>\n",
|
||
" <td>88</td>\n",
|
||
" <td>25674</td>\n",
|
||
" <td>105512</td>\n",
|
||
" <td>4</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>204396-900/021</th>\n",
|
||
" <td>2020-06-01</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>啊哟喂</td>\n",
|
||
" <td>199</td>\n",
|
||
" <td>55</td>\n",
|
||
" <td>4221</td>\n",
|
||
" <td>10945</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>AHSJ008-2</th>\n",
|
||
" <td>2020-06-01</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>139</td>\n",
|
||
" <td>61</td>\n",
|
||
" <td>3640</td>\n",
|
||
" <td>8479</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>543179-011</th>\n",
|
||
" <td>2020-06-30</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>429</td>\n",
|
||
" <td>74</td>\n",
|
||
" <td>11601</td>\n",
|
||
" <td>31746</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>AKLH641-1</th>\n",
|
||
" <td>2020-06-30</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>239</td>\n",
|
||
" <td>67</td>\n",
|
||
" <td>3490</td>\n",
|
||
" <td>16013</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>158631-050</th>\n",
|
||
" <td>2020-06-30</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>89</td>\n",
|
||
" <td>1421</td>\n",
|
||
" <td>8811</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>543367-077</th>\n",
|
||
" <td>2020-08-28</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>1199</td>\n",
|
||
" <td>89</td>\n",
|
||
" <td>45442</td>\n",
|
||
" <td>106711</td>\n",
|
||
" <td>8</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>G68188</th>\n",
|
||
" <td>2020-09-19</td>\n",
|
||
" <td>广东</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>1299</td>\n",
|
||
" <td>93</td>\n",
|
||
" <td>34290</td>\n",
|
||
" <td>120807</td>\n",
|
||
" <td>9</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>152 rows × 9 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 区域 渠道 品牌 售价 销售数量 直接成本 销售额 月份\n",
|
||
"订单号 \n",
|
||
"G70509 2020-02-03 北京 拼多多 花花姑娘 1499 89 52302 133411 2\n",
|
||
"G72186 2020-04-11 江苏 天猫 花花姑娘 1299 88 18381 114312 4\n",
|
||
"543367-077 2020-04-12 北京 拼多多 皮皮虾 1199 88 25674 105512 4\n",
|
||
"204396-900/021 2020-06-01 北京 拼多多 啊哟喂 199 55 4221 10945 6\n",
|
||
"AHSJ008-2 2020-06-01 北京 天猫 壁虎 139 61 3640 8479 6\n",
|
||
"... ... .. ... ... ... ... ... ... ..\n",
|
||
"543179-011 2020-06-30 上海 京东 皮皮虾 429 74 11601 31746 6\n",
|
||
"AKLH641-1 2020-06-30 上海 实体 壁虎 239 67 3490 16013 6\n",
|
||
"158631-050 2020-06-30 北京 天猫 八匹马 99 89 1421 8811 6\n",
|
||
"543367-077 2020-08-28 上海 天猫 皮皮虾 1199 89 45442 106711 8\n",
|
||
"G68188 2020-09-19 广东 拼多多 花花姑娘 1299 93 34290 120807 9\n",
|
||
"\n",
|
||
"[152 rows x 9 columns]"
|
||
]
|
||
},
|
||
"execution_count": 51,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
"# Combine conditions with | (element-wise OR); each comparison needs its own parentheses\n",
"df6[(df6['销售额'] > 100000) | (df6['月份'] == 6)]"
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 52,
|
||
"id": "5adb86b9-8b31-49cb-9292-94189f3714c5",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>区域</th>\n",
|
||
" <th>渠道</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" <th>销售额</th>\n",
|
||
" <th>月份</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>订单号</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>G70509</th>\n",
|
||
" <td>2020-02-03</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>1499</td>\n",
|
||
" <td>89</td>\n",
|
||
" <td>52302</td>\n",
|
||
" <td>133411</td>\n",
|
||
" <td>2</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>G72186</th>\n",
|
||
" <td>2020-04-11</td>\n",
|
||
" <td>江苏</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>1299</td>\n",
|
||
" <td>88</td>\n",
|
||
" <td>18381</td>\n",
|
||
" <td>114312</td>\n",
|
||
" <td>4</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>543367-077</th>\n",
|
||
" <td>2020-04-12</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>1199</td>\n",
|
||
" <td>88</td>\n",
|
||
" <td>25674</td>\n",
|
||
" <td>105512</td>\n",
|
||
" <td>4</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>G68188</th>\n",
|
||
" <td>2020-06-08</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>1299</td>\n",
|
||
" <td>80</td>\n",
|
||
" <td>29819</td>\n",
|
||
" <td>103920</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>577714-010</th>\n",
|
||
" <td>2020-06-17</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>1199</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>40884</td>\n",
|
||
" <td>116303</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>543367-077</th>\n",
|
||
" <td>2020-08-28</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>1199</td>\n",
|
||
" <td>89</td>\n",
|
||
" <td>45442</td>\n",
|
||
" <td>106711</td>\n",
|
||
" <td>8</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>G68188</th>\n",
|
||
" <td>2020-09-19</td>\n",
|
||
" <td>广东</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>1299</td>\n",
|
||
" <td>93</td>\n",
|
||
" <td>34290</td>\n",
|
||
" <td>120807</td>\n",
|
||
" <td>9</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 区域 渠道 品牌 售价 销售数量 直接成本 销售额 月份\n",
|
||
"订单号 \n",
|
||
"G70509 2020-02-03 北京 拼多多 花花姑娘 1499 89 52302 133411 2\n",
|
||
"G72186 2020-04-11 江苏 天猫 花花姑娘 1299 88 18381 114312 4\n",
|
||
"543367-077 2020-04-12 北京 拼多多 皮皮虾 1199 88 25674 105512 4\n",
|
||
"G68188 2020-06-08 北京 拼多多 花花姑娘 1299 80 29819 103920 6\n",
|
||
"577714-010 2020-06-17 上海 拼多多 皮皮虾 1199 97 40884 116303 6\n",
|
||
"543367-077 2020-08-28 上海 天猫 皮皮虾 1199 89 45442 106711 8\n",
|
||
"G68188 2020-09-19 广东 拼多多 花花姑娘 1299 93 34290 120807 9"
|
||
]
|
||
},
|
||
"execution_count": 52,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df6.query('销售额 > 100000')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 53,
|
||
"id": "b768afa0-7066-4a1d-8f10-b88386587388",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>区域</th>\n",
|
||
" <th>渠道</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" <th>销售额</th>\n",
|
||
" <th>月份</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>订单号</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>D86056</th>\n",
|
||
" <td>2020-06-01</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>469</td>\n",
|
||
" <td>24</td>\n",
|
||
" <td>3445</td>\n",
|
||
" <td>11256</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>543179-011</th>\n",
|
||
" <td>2020-06-02</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>429</td>\n",
|
||
" <td>58</td>\n",
|
||
" <td>8002</td>\n",
|
||
" <td>24882</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>AKLH651-2</th>\n",
|
||
" <td>2020-06-04</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>299</td>\n",
|
||
" <td>78</td>\n",
|
||
" <td>3577</td>\n",
|
||
" <td>23322</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>F89396</th>\n",
|
||
" <td>2020-06-07</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>199</td>\n",
|
||
" <td>93</td>\n",
|
||
" <td>7370</td>\n",
|
||
" <td>18507</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>X23567</th>\n",
|
||
" <td>2020-06-09</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>429</td>\n",
|
||
" <td>46</td>\n",
|
||
" <td>6484</td>\n",
|
||
" <td>19734</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>G71183</th>\n",
|
||
" <td>2020-06-10</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>369</td>\n",
|
||
" <td>93</td>\n",
|
||
" <td>9247</td>\n",
|
||
" <td>34317</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>D89458</th>\n",
|
||
" <td>2020-06-11</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>299</td>\n",
|
||
" <td>85</td>\n",
|
||
" <td>6379</td>\n",
|
||
" <td>25415</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>AKLJ034-3</th>\n",
|
||
" <td>2020-06-12</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>239</td>\n",
|
||
" <td>81</td>\n",
|
||
" <td>8048</td>\n",
|
||
" <td>19359</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>AHSJ017-3</th>\n",
|
||
" <td>2020-06-13</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>139</td>\n",
|
||
" <td>96</td>\n",
|
||
" <td>5892</td>\n",
|
||
" <td>13344</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>182802-050</th>\n",
|
||
" <td>2020-06-15</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>199</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>1760</td>\n",
|
||
" <td>5174</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>G70260</th>\n",
|
||
" <td>2020-06-15</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>329</td>\n",
|
||
" <td>15</td>\n",
|
||
" <td>1491</td>\n",
|
||
" <td>4935</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>FT001-18-1763</th>\n",
|
||
" <td>2020-06-17</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>699</td>\n",
|
||
" <td>98</td>\n",
|
||
" <td>25835</td>\n",
|
||
" <td>68502</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>182717-001</th>\n",
|
||
" <td>2020-06-18</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>69</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>255</td>\n",
|
||
" <td>690</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>158631-050</th>\n",
|
||
" <td>2020-06-20</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>66</td>\n",
|
||
" <td>2670</td>\n",
|
||
" <td>6534</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>D87692</th>\n",
|
||
" <td>2020-06-22</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>399</td>\n",
|
||
" <td>82</td>\n",
|
||
" <td>5058</td>\n",
|
||
" <td>32718</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>158636-050</th>\n",
|
||
" <td>2020-06-25</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>119</td>\n",
|
||
" <td>22</td>\n",
|
||
" <td>781</td>\n",
|
||
" <td>2618</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>G80825</th>\n",
|
||
" <td>2020-06-28</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>399</td>\n",
|
||
" <td>84</td>\n",
|
||
" <td>12260</td>\n",
|
||
" <td>33516</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>X12399</th>\n",
|
||
" <td>2020-06-29</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>329</td>\n",
|
||
" <td>83</td>\n",
|
||
" <td>4926</td>\n",
|
||
" <td>27307</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>AKLH641-1</th>\n",
|
||
" <td>2020-06-30</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>239</td>\n",
|
||
" <td>67</td>\n",
|
||
" <td>3490</td>\n",
|
||
" <td>16013</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 区域 渠道 品牌 售价 销售数量 直接成本 销售额 月份\n",
|
||
"订单号 \n",
|
||
"D86056 2020-06-01 北京 实体 花花姑娘 469 24 3445 11256 6\n",
|
||
"543179-011 2020-06-02 北京 实体 皮皮虾 429 58 8002 24882 6\n",
|
||
"AKLH651-2 2020-06-04 北京 实体 壁虎 299 78 3577 23322 6\n",
|
||
"F89396 2020-06-07 福建 实体 花花姑娘 199 93 7370 18507 6\n",
|
||
"X23567 2020-06-09 上海 实体 花花姑娘 429 46 6484 19734 6\n",
|
||
"G71183 2020-06-10 北京 实体 花花姑娘 369 93 9247 34317 6\n",
|
||
"D89458 2020-06-11 北京 实体 花花姑娘 299 85 6379 25415 6\n",
|
||
"AKLJ034-3 2020-06-12 福建 实体 壁虎 239 81 8048 19359 6\n",
|
||
"AHSJ017-3 2020-06-13 福建 实体 壁虎 139 96 5892 13344 6\n",
|
||
"182802-050 2020-06-15 上海 实体 八匹马 199 26 1760 5174 6\n",
|
||
"G70260 2020-06-15 福建 实体 花花姑娘 329 15 1491 4935 6\n",
|
||
"FT001-18-1763 2020-06-17 上海 实体 八匹马 699 98 25835 68502 6\n",
|
||
"182717-001 2020-06-18 北京 实体 八匹马 69 10 255 690 6\n",
|
||
"158631-050 2020-06-20 福建 实体 八匹马 99 66 2670 6534 6\n",
|
||
"D87692 2020-06-22 福建 实体 花花姑娘 399 82 5058 32718 6\n",
|
||
"158636-050 2020-06-25 北京 实体 八匹马 119 22 781 2618 6\n",
|
||
"G80825 2020-06-28 北京 实体 花花姑娘 399 84 12260 33516 6\n",
|
||
"X12399 2020-06-29 上海 实体 花花姑娘 329 83 4926 27307 6\n",
|
||
"AKLH641-1 2020-06-30 上海 实体 壁虎 239 67 3490 16013 6"
|
||
]
|
||
},
|
||
"execution_count": 53,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df6.query('月份 == 6 and 渠道 == \"实体\"')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 54,
|
||
"id": "2e57b21c-0565-4352-8924-de169497bce0",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>区域</th>\n",
|
||
" <th>渠道</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" <th>销售额</th>\n",
|
||
" <th>月份</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>订单号</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>G68188</th>\n",
|
||
" <td>2020-06-08</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>1299</td>\n",
|
||
" <td>80</td>\n",
|
||
" <td>29819</td>\n",
|
||
" <td>103920</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>577714-010</th>\n",
|
||
" <td>2020-06-17</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>1199</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>40884</td>\n",
|
||
" <td>116303</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 区域 渠道 品牌 售价 销售数量 直接成本 销售额 月份\n",
|
||
"订单号 \n",
|
||
"G68188 2020-06-08 北京 拼多多 花花姑娘 1299 80 29819 103920 6\n",
|
||
"577714-010 2020-06-17 上海 拼多多 皮皮虾 1199 97 40884 116303 6"
|
||
]
|
||
},
|
||
"execution_count": 54,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df6.query('销售额 > 100000 and 月份 == 6')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 55,
|
||
"id": "7ef8ba56-5293-41b0-8208-85a0eed735e8",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>区域</th>\n",
|
||
" <th>渠道</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" <th>销售额</th>\n",
|
||
" <th>月份</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>订单号</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>205333-031</th>\n",
|
||
" <td>2020-12-21</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>98</td>\n",
|
||
" <td>6150</td>\n",
|
||
" <td>16562</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>F76717</th>\n",
|
||
" <td>2020-06-24</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>429</td>\n",
|
||
" <td>15</td>\n",
|
||
" <td>2403</td>\n",
|
||
" <td>6435</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>577714-010</th>\n",
|
||
" <td>2020-02-01</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>1199</td>\n",
|
||
" <td>55</td>\n",
|
||
" <td>22707</td>\n",
|
||
" <td>65945</td>\n",
|
||
" <td>2</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>F45562</th>\n",
|
||
" <td>2020-04-16</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>599</td>\n",
|
||
" <td>90</td>\n",
|
||
" <td>14111</td>\n",
|
||
" <td>53910</td>\n",
|
||
" <td>4</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>211466-901/519</th>\n",
|
||
" <td>2020-01-29</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>199</td>\n",
|
||
" <td>52</td>\n",
|
||
" <td>4651</td>\n",
|
||
" <td>10348</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>F76716</th>\n",
|
||
" <td>2020-04-20</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>429</td>\n",
|
||
" <td>70</td>\n",
|
||
" <td>11772</td>\n",
|
||
" <td>30030</td>\n",
|
||
" <td>4</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>G69627</th>\n",
|
||
" <td>2020-06-09</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>999</td>\n",
|
||
" <td>36</td>\n",
|
||
" <td>14206</td>\n",
|
||
" <td>35964</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>588670-010</th>\n",
|
||
" <td>2020-02-15</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>499</td>\n",
|
||
" <td>75</td>\n",
|
||
" <td>15335</td>\n",
|
||
" <td>37425</td>\n",
|
||
" <td>2</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>D86041</th>\n",
|
||
" <td>2020-03-11</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>399</td>\n",
|
||
" <td>40</td>\n",
|
||
" <td>5490</td>\n",
|
||
" <td>15960</td>\n",
|
||
" <td>3</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>204266-050</th>\n",
|
||
" <td>2020-09-26</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>啊哟喂</td>\n",
|
||
" <td>239</td>\n",
|
||
" <td>46</td>\n",
|
||
" <td>4116</td>\n",
|
||
" <td>10994</td>\n",
|
||
" <td>9</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>100 rows × 9 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 区域 渠道 品牌 售价 销售数量 直接成本 销售额 月份\n",
|
||
"订单号 \n",
|
||
"205333-031 2020-12-21 北京 京东 八匹马 169 98 6150 16562 12\n",
|
||
"F76717 2020-06-24 福建 天猫 花花姑娘 429 15 2403 6435 6\n",
|
||
"577714-010 2020-02-01 北京 天猫 皮皮虾 1199 55 22707 65945 2\n",
|
||
"F45562 2020-04-16 福建 天猫 花花姑娘 599 90 14111 53910 4\n",
|
||
"211466-901/519 2020-01-29 上海 天猫 八匹马 199 52 4651 10348 1\n",
|
||
"... ... .. .. ... ... ... ... ... ..\n",
|
||
"F76716 2020-04-20 上海 天猫 花花姑娘 429 70 11772 30030 4\n",
|
||
"G69627 2020-06-09 北京 天猫 花花姑娘 999 36 14206 35964 6\n",
|
||
"588670-010 2020-02-15 上海 抖音 皮皮虾 499 75 15335 37425 2\n",
|
||
"D86041 2020-03-11 上海 天猫 花花姑娘 399 40 5490 15960 3\n",
|
||
"204266-050 2020-09-26 上海 抖音 啊哟喂 239 46 4116 10994 9\n",
|
||
"\n",
|
||
"[100 rows x 9 columns]"
|
||
]
|
||
},
|
||
"execution_count": 55,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# 随机抽样\n",
|
||
"df6.sample(n=100)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 56,
|
||
"id": "bfcd52d7-eac4-4776-b0e3-a37e67e349f3",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>区域</th>\n",
|
||
" <th>渠道</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" <th>销售额</th>\n",
|
||
" <th>月份</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>订单号</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>G74904</th>\n",
|
||
" <td>2020-08-18</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>499</td>\n",
|
||
" <td>88</td>\n",
|
||
" <td>15952</td>\n",
|
||
" <td>43912</td>\n",
|
||
" <td>8</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>D89096</th>\n",
|
||
" <td>2020-03-29</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>399</td>\n",
|
||
" <td>73</td>\n",
|
||
" <td>13022</td>\n",
|
||
" <td>29127</td>\n",
|
||
" <td>3</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>F89399</th>\n",
|
||
" <td>2020-02-14</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>499</td>\n",
|
||
" <td>46</td>\n",
|
||
" <td>5758</td>\n",
|
||
" <td>22954</td>\n",
|
||
" <td>2</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>D87692</th>\n",
|
||
" <td>2020-10-29</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>399</td>\n",
|
||
" <td>48</td>\n",
|
||
" <td>8074</td>\n",
|
||
" <td>19152</td>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>205301-477</th>\n",
|
||
" <td>2020-11-24</td>\n",
|
||
" <td>广东</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>199</td>\n",
|
||
" <td>47</td>\n",
|
||
" <td>1762</td>\n",
|
||
" <td>9353</td>\n",
|
||
" <td>11</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>543369-010</th>\n",
|
||
" <td>2020-01-02</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>799</td>\n",
|
||
" <td>68</td>\n",
|
||
" <td>15203</td>\n",
|
||
" <td>54332</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>G72212</th>\n",
|
||
" <td>2020-07-07</td>\n",
|
||
" <td>江苏</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>899</td>\n",
|
||
" <td>45</td>\n",
|
||
" <td>6922</td>\n",
|
||
" <td>40455</td>\n",
|
||
" <td>7</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>479935-012</th>\n",
|
||
" <td>2020-08-02</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>349</td>\n",
|
||
" <td>25</td>\n",
|
||
" <td>2098</td>\n",
|
||
" <td>8725</td>\n",
|
||
" <td>8</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>480239-010</th>\n",
|
||
" <td>2020-09-02</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>299</td>\n",
|
||
" <td>36</td>\n",
|
||
" <td>4384</td>\n",
|
||
" <td>10764</td>\n",
|
||
" <td>9</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>AWDH721-2</th>\n",
|
||
" <td>2020-07-04</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>39</td>\n",
|
||
" <td>4597</td>\n",
|
||
" <td>10491</td>\n",
|
||
" <td>7</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>92 rows × 9 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 区域 渠道 品牌 售价 销售数量 直接成本 销售额 月份\n",
|
||
"订单号 \n",
|
||
"G74904 2020-08-18 北京 京东 花花姑娘 499 88 15952 43912 8\n",
|
||
"D89096 2020-03-29 上海 拼多多 花花姑娘 399 73 13022 29127 3\n",
|
||
"F89399 2020-02-14 上海 抖音 花花姑娘 499 46 5758 22954 2\n",
|
||
"D87692 2020-10-29 福建 拼多多 花花姑娘 399 48 8074 19152 10\n",
|
||
"205301-477 2020-11-24 广东 京东 八匹马 199 47 1762 9353 11\n",
|
||
"... ... .. ... ... ... ... ... ... ..\n",
|
||
"543369-010 2020-01-02 上海 京东 皮皮虾 799 68 15203 54332 1\n",
|
||
"G72212 2020-07-07 江苏 天猫 花花姑娘 899 45 6922 40455 7\n",
|
||
"479935-012 2020-08-02 北京 京东 皮皮虾 349 25 2098 8725 8\n",
|
||
"480239-010 2020-09-02 福建 抖音 皮皮虾 299 36 4384 10764 9\n",
|
||
"AWDH721-2 2020-07-04 上海 抖音 壁虎 269 39 4597 10491 7\n",
|
||
"\n",
|
||
"[92 rows x 9 columns]"
|
||
]
|
||
},
|
||
"execution_count": 56,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"df6.sample(frac=0.05)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 57,
|
||
"id": "1c654ca8-3179-4fa2-9213-7d7029357342",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>销售区域</th>\n",
|
||
" <th>销售渠道</th>\n",
|
||
" <th>销售订单</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>205654-021</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>85</td>\n",
|
||
" <td>6320</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>377781-010</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>61</td>\n",
|
||
" <td>2452</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>2020-01-03</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>FT001-N10</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>699</td>\n",
|
||
" <td>50</td>\n",
|
||
" <td>8380</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>2020-01-04</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>FT001-N10</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>699</td>\n",
|
||
" <td>15</td>\n",
|
||
" <td>2635</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>2020-01-06</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>G70357</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>699</td>\n",
|
||
" <td>49</td>\n",
|
||
" <td>8809</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>190</th>\n",
|
||
" <td>2020-12-05</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>G69924</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>599</td>\n",
|
||
" <td>75</td>\n",
|
||
" <td>7057</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>191</th>\n",
|
||
" <td>2020-12-07</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>182898-258</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>99</td>\n",
|
||
" <td>2506</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>192</th>\n",
|
||
" <td>2020-12-10</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>抖音</td>\n",
|
||
" <td>AKLJ041-2</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>42</td>\n",
|
||
" <td>1746</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>193</th>\n",
|
||
" <td>2020-12-21</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>205333-031</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>98</td>\n",
|
||
" <td>6150</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>194</th>\n",
|
||
" <td>2020-12-24</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>D88376</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>32</td>\n",
|
||
" <td>2006</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>195 rows × 8 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 销售区域 销售渠道 销售订单 品牌 售价 销售数量 直接成本\n",
|
||
"0 2020-01-01 上海 天猫 205654-021 八匹马 169 85 6320\n",
|
||
"1 2020-01-01 上海 天猫 377781-010 皮皮虾 249 61 2452\n",
|
||
"2 2020-01-03 上海 天猫 FT001-N10 八匹马 699 50 8380\n",
|
||
"3 2020-01-04 上海 实体 FT001-N10 八匹马 699 15 2635\n",
|
||
"4 2020-01-06 上海 抖音 G70357 花花姑娘 699 49 8809\n",
|
||
".. ... ... ... ... ... ... ... ...\n",
|
||
"190 2020-12-05 福建 抖音 G69924 花花姑娘 599 75 7057\n",
|
||
"191 2020-12-07 福建 拼多多 182898-258 八匹马 99 99 2506\n",
|
||
"192 2020-12-10 北京 抖音 AKLJ041-2 壁虎 269 42 1746\n",
|
||
"193 2020-12-21 北京 京东 205333-031 八匹马 169 98 6150\n",
|
||
"194 2020-12-24 福建 实体 D88376 花花姑娘 269 32 2006\n",
|
||
"\n",
|
||
"[195 rows x 8 columns]"
|
||
]
|
||
},
|
||
"execution_count": 57,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# replace=False - 无放回抽样\n",
|
||
"ignore_rows = np.random.choice(np.arange(1, 1946), size=int(1945 * 0.9), replace=False)\n",
|
||
"pd.read_excel(\n",
|
||
" 'res/2020年销售数据.xlsx',\n",
|
||
" sheet_name='data',\n",
|
||
" skiprows=ignore_rows\n",
|
||
")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "2037ed6a-d616-4c67-9f5d-ea517d6e1c6b",
|
||
"metadata": {},
|
||
"source": [
|
||
"### 数据重塑\n",
|
||
"\n",
|
||
"1. 拼接(合并结构一致的数据)\n",
|
||
"2. 合并(事实表连接维度表)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 58,
|
||
"id": "d2184fd4-bd44-459f-bda4-6dc11c09c219",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"(19, 6)"
|
||
]
|
||
},
|
||
"execution_count": 58,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# 拼接两个DataFrame - union\n",
|
||
"all_emp_df = pd.concat([emp_df1, emp_df2])\n",
|
||
"all_emp_df.shape"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 59,
|
||
"id": "05bc65a1-42ac-463c-a089-08fb8dc60855",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>ename</th>\n",
|
||
" <th>job</th>\n",
|
||
" <th>mgr</th>\n",
|
||
" <th>sal</th>\n",
|
||
" <th>comm</th>\n",
|
||
" <th>dno</th>\n",
|
||
" <th>dname</th>\n",
|
||
" <th>dloc</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>胡一刀</td>\n",
|
||
" <td>销售员</td>\n",
|
||
" <td>3344.0</td>\n",
|
||
" <td>1800</td>\n",
|
||
" <td>200.0</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>乔峰</td>\n",
|
||
" <td>分析师</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>5000</td>\n",
|
||
" <td>1500.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>李莫愁</td>\n",
|
||
" <td>设计师</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>3500</td>\n",
|
||
" <td>800.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>张无忌</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>3200</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>丘处机</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>3400</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>欧阳锋</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>3088.0</td>\n",
|
||
" <td>3200</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>张翠山</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>4000</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>黄蓉</td>\n",
|
||
" <td>销售主管</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>3000</td>\n",
|
||
" <td>800.0</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8</th>\n",
|
||
" <td>杨过</td>\n",
|
||
" <td>会计</td>\n",
|
||
" <td>5566.0</td>\n",
|
||
" <td>2200</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9</th>\n",
|
||
" <td>朱九真</td>\n",
|
||
" <td>会计</td>\n",
|
||
" <td>5566.0</td>\n",
|
||
" <td>2500</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10</th>\n",
|
||
" <td>苗人凤</td>\n",
|
||
" <td>销售员</td>\n",
|
||
" <td>3344.0</td>\n",
|
||
" <td>2500</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>11</th>\n",
|
||
" <td>郭靖</td>\n",
|
||
" <td>出纳</td>\n",
|
||
" <td>5566.0</td>\n",
|
||
" <td>2000</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>12</th>\n",
|
||
" <td>宋远桥</td>\n",
|
||
" <td>会计师</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>4000</td>\n",
|
||
" <td>1000.0</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>13</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>9000</td>\n",
|
||
" <td>1200.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>50000</td>\n",
|
||
" <td>8000.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>15</th>\n",
|
||
" <td>王大锤</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>9800.0</td>\n",
|
||
" <td>8000</td>\n",
|
||
" <td>600.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>16</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>60000</td>\n",
|
||
" <td>6000.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>17</th>\n",
|
||
" <td>骆昊</td>\n",
|
||
" <td>架构师</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>30000</td>\n",
|
||
" <td>5000.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>18</th>\n",
|
||
" <td>陈小刀</td>\n",
|
||
" <td>分析师</td>\n",
|
||
" <td>9800.0</td>\n",
|
||
" <td>10000</td>\n",
|
||
" <td>1200.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" ename job mgr sal comm dno dname dloc\n",
|
||
"0 胡一刀 销售员 3344.0 1800 200.0 30 销售部 重庆\n",
|
||
"1 乔峰 分析师 7800.0 5000 1500.0 20 研发部 成都\n",
|
||
"2 李莫愁 设计师 2056.0 3500 800.0 20 研发部 成都\n",
|
||
"3 张无忌 程序员 2056.0 3200 NaN 20 研发部 成都\n",
|
||
"4 丘处机 程序员 2056.0 3400 NaN 20 研发部 成都\n",
|
||
"5 欧阳锋 程序员 3088.0 3200 NaN 20 研发部 成都\n",
|
||
"6 张翠山 程序员 2056.0 4000 NaN 20 研发部 成都\n",
|
||
"7 黄蓉 销售主管 7800.0 3000 800.0 30 销售部 重庆\n",
|
||
"8 杨过 会计 5566.0 2200 NaN 10 会计部 北京\n",
|
||
"9 朱九真 会计 5566.0 2500 NaN 10 会计部 北京\n",
|
||
"10 苗人凤 销售员 3344.0 2500 NaN 30 销售部 重庆\n",
|
||
"11 郭靖 出纳 5566.0 2000 NaN 10 会计部 北京\n",
|
||
"12 宋远桥 会计师 7800.0 4000 1000.0 10 会计部 北京\n",
|
||
"13 张三丰 总裁 NaN 9000 1200.0 20 研发部 成都\n",
|
||
"14 张三丰 总裁 NaN 50000 8000.0 20 研发部 成都\n",
|
||
"15 王大锤 程序员 9800.0 8000 600.0 20 研发部 成都\n",
|
||
"16 张三丰 总裁 NaN 60000 6000.0 20 研发部 成都\n",
|
||
"17 骆昊 架构师 7800.0 30000 5000.0 20 研发部 成都\n",
|
||
"18 陈小刀 分析师 9800.0 10000 1200.0 20 研发部 成都"
|
||
]
|
||
},
|
||
"execution_count": 59,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# 连表 - 连接事实表和维度表 - 用维度把数据分组然后再做聚合\n",
|
||
"# 连接两个DataFrame(内连接、左外连接、右外连接、全外连接)- join\n",
|
||
"# how - 连表方式 - inner、left、right、outer\n",
|
||
"# on - 基于哪个字段连表 - left_on、right_on\n",
|
||
"all_emp_df = pd.merge(all_emp_df, dept_df, how='inner', on='dno')\n",
|
||
"all_emp_df"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 60,
|
||
"id": "c6a3d52d-a04c-494d-9ee9-2dad9805b1c1",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# 作业:在jobs目录下有若干个CVS文件,它们的数据结构是一样的,现在需要把所有CSV文件的数据拼接到一个DataFrame中\n",
|
||
"import os\n",
|
||
"\n",
|
||
"dfs = [pd.read_csv(os.path.join('res/jobs', filename))\n",
|
||
" for filename in os.listdir('res/jobs') \n",
|
||
" if filename.endswith('.csv')]\n",
|
||
"pd.concat(dfs, ignore_index=True).to_csv('res/all_jobs.csv', index=False)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "6b9ad1e1-fe5d-45a0-8755-ac6720a32ba0",
|
||
"metadata": {},
|
||
"source": [
|
||
"### 数据清洗\n",
|
||
"\n",
|
||
"1. 缺失值\n",
|
||
"2. 重复值\n",
|
||
"3. 异常值\n",
|
||
"4. 预处理"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 61,
|
||
"id": "45c835c4-559f-45f1-a501-70a8c12bbbb1",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>ename</th>\n",
|
||
" <th>job</th>\n",
|
||
" <th>mgr</th>\n",
|
||
" <th>sal</th>\n",
|
||
" <th>comm</th>\n",
|
||
" <th>dno</th>\n",
|
||
" <th>dname</th>\n",
|
||
" <th>dloc</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>11</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>12</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>13</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>15</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>16</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>True</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>17</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>18</th>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" <td>False</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" ename job mgr sal comm dno dname dloc\n",
|
||
"0 False False False False False False False False\n",
|
||
"1 False False False False False False False False\n",
|
||
"2 False False False False False False False False\n",
|
||
"3 False False False False True False False False\n",
|
||
"4 False False False False True False False False\n",
|
||
"5 False False False False True False False False\n",
|
||
"6 False False False False True False False False\n",
|
||
"7 False False False False False False False False\n",
|
||
"8 False False False False True False False False\n",
|
||
"9 False False False False True False False False\n",
|
||
"10 False False False False True False False False\n",
|
||
"11 False False False False True False False False\n",
|
||
"12 False False False False False False False False\n",
|
||
"13 False False True False False False False False\n",
|
||
"14 False False True False False False False False\n",
|
||
"15 False False False False False False False False\n",
|
||
"16 False False True False False False False False\n",
|
||
"17 False False False False False False False False\n",
|
||
"18 False False False False False False False False"
|
||
]
|
||
},
|
||
"execution_count": 61,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# 甄别缺失值\n",
|
||
"all_emp_df.isna()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 62,
|
||
"id": "fd7fbdf8-ebf2-463b-ac3b-cdb24560873a",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 False\n",
|
||
"1 False\n",
|
||
"2 False\n",
|
||
"3 True\n",
|
||
"4 True\n",
|
||
"5 True\n",
|
||
"6 True\n",
|
||
"7 False\n",
|
||
"8 True\n",
|
||
"9 True\n",
|
||
"10 True\n",
|
||
"11 True\n",
|
||
"12 False\n",
|
||
"13 False\n",
|
||
"14 False\n",
|
||
"15 False\n",
|
||
"16 False\n",
|
||
"17 False\n",
|
||
"18 False\n",
|
||
"Name: comm, dtype: bool"
|
||
]
|
||
},
|
||
"execution_count": 62,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# all_emp_df['comm'].isna()\n",
|
||
"all_emp_df['comm'].isnull()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 63,
|
||
"id": "a4f16d30-83e9-4761-92a1-780e85e721e1",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 True\n",
|
||
"1 True\n",
|
||
"2 True\n",
|
||
"3 False\n",
|
||
"4 False\n",
|
||
"5 False\n",
|
||
"6 False\n",
|
||
"7 True\n",
|
||
"8 False\n",
|
||
"9 False\n",
|
||
"10 False\n",
|
||
"11 False\n",
|
||
"12 True\n",
|
||
"13 True\n",
|
||
"14 True\n",
|
||
"15 True\n",
|
||
"16 True\n",
|
||
"17 True\n",
|
||
"18 True\n",
|
||
"Name: comm, dtype: bool"
|
||
]
|
||
},
|
||
"execution_count": 63,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# all_emp_df['comm'].notna()\n",
|
||
"all_emp_df['comm'].notnull()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 64,
|
||
"id": "9f2a153d-ab4a-475e-9ee3-0d623a289f7f",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"comm\n",
|
||
"True 11\n",
|
||
"False 8\n",
|
||
"Name: count, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 64,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"all_emp_df['comm'].notna().value_counts()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 65,
|
||
"id": "5d388d57-fa1a-405b-880e-9316354a6f05",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>ename</th>\n",
|
||
" <th>job</th>\n",
|
||
" <th>mgr</th>\n",
|
||
" <th>sal</th>\n",
|
||
" <th>comm</th>\n",
|
||
" <th>dno</th>\n",
|
||
" <th>dname</th>\n",
|
||
" <th>dloc</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>胡一刀</td>\n",
|
||
" <td>销售员</td>\n",
|
||
" <td>3344.0</td>\n",
|
||
" <td>1800</td>\n",
|
||
" <td>200.0</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>乔峰</td>\n",
|
||
" <td>分析师</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>5000</td>\n",
|
||
" <td>1500.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>李莫愁</td>\n",
|
||
" <td>设计师</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>3500</td>\n",
|
||
" <td>800.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>黄蓉</td>\n",
|
||
" <td>销售主管</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>3000</td>\n",
|
||
" <td>800.0</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>12</th>\n",
|
||
" <td>宋远桥</td>\n",
|
||
" <td>会计师</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>4000</td>\n",
|
||
" <td>1000.0</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>15</th>\n",
|
||
" <td>王大锤</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>9800.0</td>\n",
|
||
" <td>8000</td>\n",
|
||
" <td>600.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>17</th>\n",
|
||
" <td>骆昊</td>\n",
|
||
" <td>架构师</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>30000</td>\n",
|
||
" <td>5000.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>18</th>\n",
|
||
" <td>陈小刀</td>\n",
|
||
" <td>分析师</td>\n",
|
||
" <td>9800.0</td>\n",
|
||
" <td>10000</td>\n",
|
||
" <td>1200.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" ename job mgr sal comm dno dname dloc\n",
|
||
"0 胡一刀 销售员 3344.0 1800 200.0 30 销售部 重庆\n",
|
||
"1 乔峰 分析师 7800.0 5000 1500.0 20 研发部 成都\n",
|
||
"2 李莫愁 设计师 2056.0 3500 800.0 20 研发部 成都\n",
|
||
"7 黄蓉 销售主管 7800.0 3000 800.0 30 销售部 重庆\n",
|
||
"12 宋远桥 会计师 7800.0 4000 1000.0 10 会计部 北京\n",
|
||
"15 王大锤 程序员 9800.0 8000 600.0 20 研发部 成都\n",
|
||
"17 骆昊 架构师 7800.0 30000 5000.0 20 研发部 成都\n",
|
||
"18 陈小刀 分析师 9800.0 10000 1200.0 20 研发部 成都"
|
||
]
|
||
},
|
||
"execution_count": 65,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# 删除空值 - 删除带有空值的行\n",
|
||
"all_emp_df.dropna()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 66,
|
||
"id": "b40fa037-3fab-454e-a300-2e9dcf4b2b60",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>ename</th>\n",
|
||
" <th>job</th>\n",
|
||
" <th>sal</th>\n",
|
||
" <th>dno</th>\n",
|
||
" <th>dname</th>\n",
|
||
" <th>dloc</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>胡一刀</td>\n",
|
||
" <td>销售员</td>\n",
|
||
" <td>1800</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>乔峰</td>\n",
|
||
" <td>分析师</td>\n",
|
||
" <td>5000</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>李莫愁</td>\n",
|
||
" <td>设计师</td>\n",
|
||
" <td>3500</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>张无忌</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>3200</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>丘处机</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>3400</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>欧阳锋</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>3200</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>张翠山</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>4000</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>黄蓉</td>\n",
|
||
" <td>销售主管</td>\n",
|
||
" <td>3000</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8</th>\n",
|
||
" <td>杨过</td>\n",
|
||
" <td>会计</td>\n",
|
||
" <td>2200</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9</th>\n",
|
||
" <td>朱九真</td>\n",
|
||
" <td>会计</td>\n",
|
||
" <td>2500</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10</th>\n",
|
||
" <td>苗人凤</td>\n",
|
||
" <td>销售员</td>\n",
|
||
" <td>2500</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>11</th>\n",
|
||
" <td>郭靖</td>\n",
|
||
" <td>出纳</td>\n",
|
||
" <td>2000</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>12</th>\n",
|
||
" <td>宋远桥</td>\n",
|
||
" <td>会计师</td>\n",
|
||
" <td>4000</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>13</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>9000</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>50000</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>15</th>\n",
|
||
" <td>王大锤</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>8000</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>16</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>60000</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>17</th>\n",
|
||
" <td>骆昊</td>\n",
|
||
" <td>架构师</td>\n",
|
||
" <td>30000</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>18</th>\n",
|
||
" <td>陈小刀</td>\n",
|
||
" <td>分析师</td>\n",
|
||
" <td>10000</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" ename job sal dno dname dloc\n",
|
||
"0 胡一刀 销售员 1800 30 销售部 重庆\n",
|
||
"1 乔峰 分析师 5000 20 研发部 成都\n",
|
||
"2 李莫愁 设计师 3500 20 研发部 成都\n",
|
||
"3 张无忌 程序员 3200 20 研发部 成都\n",
|
||
"4 丘处机 程序员 3400 20 研发部 成都\n",
|
||
"5 欧阳锋 程序员 3200 20 研发部 成都\n",
|
||
"6 张翠山 程序员 4000 20 研发部 成都\n",
|
||
"7 黄蓉 销售主管 3000 30 销售部 重庆\n",
|
||
"8 杨过 会计 2200 10 会计部 北京\n",
|
||
"9 朱九真 会计 2500 10 会计部 北京\n",
|
||
"10 苗人凤 销售员 2500 30 销售部 重庆\n",
|
||
"11 郭靖 出纳 2000 10 会计部 北京\n",
|
||
"12 宋远桥 会计师 4000 10 会计部 北京\n",
|
||
"13 张三丰 总裁 9000 20 研发部 成都\n",
|
||
"14 张三丰 总裁 50000 20 研发部 成都\n",
|
||
"15 王大锤 程序员 8000 20 研发部 成都\n",
|
||
"16 张三丰 总裁 60000 20 研发部 成都\n",
|
||
"17 骆昊 架构师 30000 20 研发部 成都\n",
|
||
"18 陈小刀 分析师 10000 20 研发部 成都"
|
||
]
|
||
},
|
||
"execution_count": 66,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"all_emp_df.dropna(axis=1)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 67,
|
||
"id": "67ae21a1-7dc1-496b-85b5-013d79d25a63",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 3344.0\n",
|
||
"1 7800.0\n",
|
||
"2 2056.0\n",
|
||
"3 2056.0\n",
|
||
"4 2056.0\n",
|
||
"5 3088.0\n",
|
||
"6 2056.0\n",
|
||
"7 7800.0\n",
|
||
"8 5566.0\n",
|
||
"9 5566.0\n",
|
||
"10 3344.0\n",
|
||
"11 5566.0\n",
|
||
"12 7800.0\n",
|
||
"15 9800.0\n",
|
||
"17 7800.0\n",
|
||
"18 9800.0\n",
|
||
"Name: mgr, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 67,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"all_emp_df.mgr.dropna()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 68,
|
||
"id": "66745379-9db7-42b0-ab6b-a55e870a515b",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>ename</th>\n",
|
||
" <th>job</th>\n",
|
||
" <th>mgr</th>\n",
|
||
" <th>sal</th>\n",
|
||
" <th>comm</th>\n",
|
||
" <th>dno</th>\n",
|
||
" <th>dname</th>\n",
|
||
" <th>dloc</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>胡一刀</td>\n",
|
||
" <td>销售员</td>\n",
|
||
" <td>3344.0</td>\n",
|
||
" <td>1800</td>\n",
|
||
" <td>200.0</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>乔峰</td>\n",
|
||
" <td>分析师</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>5000</td>\n",
|
||
" <td>1500.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>李莫愁</td>\n",
|
||
" <td>设计师</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>3500</td>\n",
|
||
" <td>800.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>张无忌</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>3200</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>丘处机</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>3400</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>欧阳锋</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>3088.0</td>\n",
|
||
" <td>3200</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>张翠山</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>4000</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>黄蓉</td>\n",
|
||
" <td>销售主管</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>3000</td>\n",
|
||
" <td>800.0</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8</th>\n",
|
||
" <td>杨过</td>\n",
|
||
" <td>会计</td>\n",
|
||
" <td>5566.0</td>\n",
|
||
" <td>2200</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9</th>\n",
|
||
" <td>朱九真</td>\n",
|
||
" <td>会计</td>\n",
|
||
" <td>5566.0</td>\n",
|
||
" <td>2500</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10</th>\n",
|
||
" <td>苗人凤</td>\n",
|
||
" <td>销售员</td>\n",
|
||
" <td>3344.0</td>\n",
|
||
" <td>2500</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>11</th>\n",
|
||
" <td>郭靖</td>\n",
|
||
" <td>出纳</td>\n",
|
||
" <td>5566.0</td>\n",
|
||
" <td>2000</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>12</th>\n",
|
||
" <td>宋远桥</td>\n",
|
||
" <td>会计师</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>4000</td>\n",
|
||
" <td>1000.0</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>13</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>9000</td>\n",
|
||
" <td>1200.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>50000</td>\n",
|
||
" <td>8000.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>15</th>\n",
|
||
" <td>王大锤</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>9800.0</td>\n",
|
||
" <td>8000</td>\n",
|
||
" <td>600.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>16</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>0.0</td>\n",
|
||
" <td>60000</td>\n",
|
||
" <td>6000.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>17</th>\n",
|
||
" <td>骆昊</td>\n",
|
||
" <td>架构师</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>30000</td>\n",
|
||
" <td>5000.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>18</th>\n",
|
||
" <td>陈小刀</td>\n",
|
||
" <td>分析师</td>\n",
|
||
" <td>9800.0</td>\n",
|
||
" <td>10000</td>\n",
|
||
" <td>1200.0</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" ename job mgr sal comm dno dname dloc\n",
|
||
"0 胡一刀 销售员 3344.0 1800 200.0 30 销售部 重庆\n",
|
||
"1 乔峰 分析师 7800.0 5000 1500.0 20 研发部 成都\n",
|
||
"2 李莫愁 设计师 2056.0 3500 800.0 20 研发部 成都\n",
|
||
"3 张无忌 程序员 2056.0 3200 0.0 20 研发部 成都\n",
|
||
"4 丘处机 程序员 2056.0 3400 0.0 20 研发部 成都\n",
|
||
"5 欧阳锋 程序员 3088.0 3200 0.0 20 研发部 成都\n",
|
||
"6 张翠山 程序员 2056.0 4000 0.0 20 研发部 成都\n",
|
||
"7 黄蓉 销售主管 7800.0 3000 800.0 30 销售部 重庆\n",
|
||
"8 杨过 会计 5566.0 2200 0.0 10 会计部 北京\n",
|
||
"9 朱九真 会计 5566.0 2500 0.0 10 会计部 北京\n",
|
||
"10 苗人凤 销售员 3344.0 2500 0.0 30 销售部 重庆\n",
|
||
"11 郭靖 出纳 5566.0 2000 0.0 10 会计部 北京\n",
|
||
"12 宋远桥 会计师 7800.0 4000 1000.0 10 会计部 北京\n",
|
||
"13 张三丰 总裁 0.0 9000 1200.0 20 研发部 成都\n",
|
||
"14 张三丰 总裁 0.0 50000 8000.0 20 研发部 成都\n",
|
||
"15 王大锤 程序员 9800.0 8000 600.0 20 研发部 成都\n",
|
||
"16 张三丰 总裁 0.0 60000 6000.0 20 研发部 成都\n",
|
||
"17 骆昊 架构师 7800.0 30000 5000.0 20 研发部 成都\n",
|
||
"18 陈小刀 分析师 9800.0 10000 1200.0 20 研发部 成都"
|
||
]
|
||
},
|
||
"execution_count": 68,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# 填充空值\n",
|
||
"all_emp_df.fillna(0)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 69,
|
||
"id": "e1664115-30e0-4946-ae4b-c919bb319ddc",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 200\n",
|
||
"1 1500\n",
|
||
"2 800\n",
|
||
"3 0\n",
|
||
"4 0\n",
|
||
"5 0\n",
|
||
"6 0\n",
|
||
"7 800\n",
|
||
"8 0\n",
|
||
"9 0\n",
|
||
"10 0\n",
|
||
"11 0\n",
|
||
"12 1000\n",
|
||
"13 1200\n",
|
||
"14 8000\n",
|
||
"15 600\n",
|
||
"16 6000\n",
|
||
"17 5000\n",
|
||
"18 1200\n",
|
||
"Name: comm, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 69,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"all_emp_df.comm.fillna(0).astype('i8')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 70,
|
||
"id": "e1743531-66a2-42a4-8c28-ad268efc848c",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 200.0\n",
|
||
"1 1500.0\n",
|
||
"2 800.0\n",
|
||
"3 800.0\n",
|
||
"4 800.0\n",
|
||
"5 800.0\n",
|
||
"6 800.0\n",
|
||
"7 800.0\n",
|
||
"8 1000.0\n",
|
||
"9 1000.0\n",
|
||
"10 1000.0\n",
|
||
"11 1000.0\n",
|
||
"12 1000.0\n",
|
||
"13 1200.0\n",
|
||
"14 8000.0\n",
|
||
"15 600.0\n",
|
||
"16 6000.0\n",
|
||
"17 5000.0\n",
|
||
"18 1200.0\n",
|
||
"Name: comm, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 70,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Backward fill: copy the next non-null value below a gap upward into it\n",
|
||
"all_emp_df.comm.bfill()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 71,
|
||
"id": "5fcef0a0-ff29-42bd-9955-5a97595390fd",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 200.0\n",
|
||
"1 1500.0\n",
|
||
"2 800.0\n",
|
||
"3 800.0\n",
|
||
"4 800.0\n",
|
||
"5 800.0\n",
|
||
"6 800.0\n",
|
||
"7 800.0\n",
|
||
"8 800.0\n",
|
||
"9 800.0\n",
|
||
"10 800.0\n",
|
||
"11 800.0\n",
|
||
"12 1000.0\n",
|
||
"13 1200.0\n",
|
||
"14 8000.0\n",
|
||
"15 600.0\n",
|
||
"16 6000.0\n",
|
||
"17 5000.0\n",
|
||
"18 1200.0\n",
|
||
"Name: comm, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 71,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Forward fill: copy the last non-null value above a gap downward into it\n",
|
||
"all_emp_df.comm.ffill()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 72,
|
||
"id": "eeeb9be3-802c-44e3-80a0-465aba1a485a",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Fill missing values by interpolation (interpolate)\n",
|
||
"all_emp_df['comm'] = all_emp_df.comm.interpolate(method='linear')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 73,
|
||
"id": "f1f094c3-1cc2-4826-a04a-24150ea9cef8",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>ename</th>\n",
|
||
" <th>job</th>\n",
|
||
" <th>mgr</th>\n",
|
||
" <th>sal</th>\n",
|
||
" <th>comm</th>\n",
|
||
" <th>dno</th>\n",
|
||
" <th>dname</th>\n",
|
||
" <th>dloc</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>胡一刀</td>\n",
|
||
" <td>销售员</td>\n",
|
||
" <td>3344.0</td>\n",
|
||
" <td>1800</td>\n",
|
||
" <td>200</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>乔峰</td>\n",
|
||
" <td>分析师</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>5000</td>\n",
|
||
" <td>1500</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>李莫愁</td>\n",
|
||
" <td>设计师</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>3500</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>张无忌</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>3200</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>丘处机</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>3400</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>欧阳锋</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>3088.0</td>\n",
|
||
" <td>3200</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>张翠山</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056.0</td>\n",
|
||
" <td>4000</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>黄蓉</td>\n",
|
||
" <td>销售主管</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>3000</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8</th>\n",
|
||
" <td>杨过</td>\n",
|
||
" <td>会计</td>\n",
|
||
" <td>5566.0</td>\n",
|
||
" <td>2200</td>\n",
|
||
" <td>840</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9</th>\n",
|
||
" <td>朱九真</td>\n",
|
||
" <td>会计</td>\n",
|
||
" <td>5566.0</td>\n",
|
||
" <td>2500</td>\n",
|
||
" <td>880</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10</th>\n",
|
||
" <td>苗人凤</td>\n",
|
||
" <td>销售员</td>\n",
|
||
" <td>3344.0</td>\n",
|
||
" <td>2500</td>\n",
|
||
" <td>920</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>11</th>\n",
|
||
" <td>郭靖</td>\n",
|
||
" <td>出纳</td>\n",
|
||
" <td>5566.0</td>\n",
|
||
" <td>2000</td>\n",
|
||
" <td>960</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>12</th>\n",
|
||
" <td>宋远桥</td>\n",
|
||
" <td>会计师</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>4000</td>\n",
|
||
" <td>1000</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>13</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>9000</td>\n",
|
||
" <td>1200</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>50000</td>\n",
|
||
" <td>8000</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>15</th>\n",
|
||
" <td>王大锤</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>9800.0</td>\n",
|
||
" <td>8000</td>\n",
|
||
" <td>600</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>16</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>60000</td>\n",
|
||
" <td>6000</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>17</th>\n",
|
||
" <td>骆昊</td>\n",
|
||
" <td>架构师</td>\n",
|
||
" <td>7800.0</td>\n",
|
||
" <td>30000</td>\n",
|
||
" <td>5000</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>18</th>\n",
|
||
" <td>陈小刀</td>\n",
|
||
" <td>分析师</td>\n",
|
||
" <td>9800.0</td>\n",
|
||
" <td>10000</td>\n",
|
||
" <td>1200</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" ename job mgr sal comm dno dname dloc\n",
|
||
"0 胡一刀 销售员 3344.0 1800 200 30 销售部 重庆\n",
|
||
"1 乔峰 分析师 7800.0 5000 1500 20 研发部 成都\n",
|
||
"2 李莫愁 设计师 2056.0 3500 800 20 研发部 成都\n",
|
||
"3 张无忌 程序员 2056.0 3200 800 20 研发部 成都\n",
|
||
"4 丘处机 程序员 2056.0 3400 800 20 研发部 成都\n",
|
||
"5 欧阳锋 程序员 3088.0 3200 800 20 研发部 成都\n",
|
||
"6 张翠山 程序员 2056.0 4000 800 20 研发部 成都\n",
|
||
"7 黄蓉 销售主管 7800.0 3000 800 30 销售部 重庆\n",
|
||
"8 杨过 会计 5566.0 2200 840 10 会计部 北京\n",
|
||
"9 朱九真 会计 5566.0 2500 880 10 会计部 北京\n",
|
||
"10 苗人凤 销售员 3344.0 2500 920 30 销售部 重庆\n",
|
||
"11 郭靖 出纳 5566.0 2000 960 10 会计部 北京\n",
|
||
"12 宋远桥 会计师 7800.0 4000 1000 10 会计部 北京\n",
|
||
"13 张三丰 总裁 NaN 9000 1200 20 研发部 成都\n",
|
||
"14 张三丰 总裁 NaN 50000 8000 20 研发部 成都\n",
|
||
"15 王大锤 程序员 9800.0 8000 600 20 研发部 成都\n",
|
||
"16 张三丰 总裁 NaN 60000 6000 20 研发部 成都\n",
|
||
"17 骆昊 架构师 7800.0 30000 5000 20 研发部 成都\n",
|
||
"18 陈小刀 分析师 9800.0 10000 1200 20 研发部 成都"
|
||
]
|
||
},
|
||
"execution_count": 73,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"all_emp_df['comm'] = all_emp_df.comm.astype('i8')\n",
|
||
"all_emp_df"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 74,
|
||
"id": "a739242d-ebd2-42d2-9ec7-9a5939cbf74a",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>ename</th>\n",
|
||
" <th>job</th>\n",
|
||
" <th>mgr</th>\n",
|
||
" <th>sal</th>\n",
|
||
" <th>comm</th>\n",
|
||
" <th>dno</th>\n",
|
||
" <th>dname</th>\n",
|
||
" <th>dloc</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>胡一刀</td>\n",
|
||
" <td>销售员</td>\n",
|
||
" <td>3344</td>\n",
|
||
" <td>1800</td>\n",
|
||
" <td>200</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>乔峰</td>\n",
|
||
" <td>分析师</td>\n",
|
||
" <td>7800</td>\n",
|
||
" <td>5000</td>\n",
|
||
" <td>1500</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>李莫愁</td>\n",
|
||
" <td>设计师</td>\n",
|
||
" <td>2056</td>\n",
|
||
" <td>3500</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>张无忌</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056</td>\n",
|
||
" <td>3200</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>丘处机</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056</td>\n",
|
||
" <td>3400</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>欧阳锋</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>3088</td>\n",
|
||
" <td>3200</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>张翠山</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056</td>\n",
|
||
" <td>4000</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>黄蓉</td>\n",
|
||
" <td>销售主管</td>\n",
|
||
" <td>7800</td>\n",
|
||
" <td>3000</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8</th>\n",
|
||
" <td>杨过</td>\n",
|
||
" <td>会计</td>\n",
|
||
" <td>5566</td>\n",
|
||
" <td>2200</td>\n",
|
||
" <td>840</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9</th>\n",
|
||
" <td>朱九真</td>\n",
|
||
" <td>会计</td>\n",
|
||
" <td>5566</td>\n",
|
||
" <td>2500</td>\n",
|
||
" <td>880</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10</th>\n",
|
||
" <td>苗人凤</td>\n",
|
||
" <td>销售员</td>\n",
|
||
" <td>3344</td>\n",
|
||
" <td>2500</td>\n",
|
||
" <td>920</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>11</th>\n",
|
||
" <td>郭靖</td>\n",
|
||
" <td>出纳</td>\n",
|
||
" <td>5566</td>\n",
|
||
" <td>2000</td>\n",
|
||
" <td>960</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>12</th>\n",
|
||
" <td>宋远桥</td>\n",
|
||
" <td>会计师</td>\n",
|
||
" <td>7800</td>\n",
|
||
" <td>4000</td>\n",
|
||
" <td>1000</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>13</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>-1</td>\n",
|
||
" <td>9000</td>\n",
|
||
" <td>1200</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>14</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>-1</td>\n",
|
||
" <td>50000</td>\n",
|
||
" <td>8000</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>15</th>\n",
|
||
" <td>王大锤</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>9800</td>\n",
|
||
" <td>8000</td>\n",
|
||
" <td>600</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>16</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>-1</td>\n",
|
||
" <td>60000</td>\n",
|
||
" <td>6000</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>17</th>\n",
|
||
" <td>骆昊</td>\n",
|
||
" <td>架构师</td>\n",
|
||
" <td>7800</td>\n",
|
||
" <td>30000</td>\n",
|
||
" <td>5000</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>18</th>\n",
|
||
" <td>陈小刀</td>\n",
|
||
" <td>分析师</td>\n",
|
||
" <td>9800</td>\n",
|
||
" <td>10000</td>\n",
|
||
" <td>1200</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" ename job mgr sal comm dno dname dloc\n",
|
||
"0 胡一刀 销售员 3344 1800 200 30 销售部 重庆\n",
|
||
"1 乔峰 分析师 7800 5000 1500 20 研发部 成都\n",
|
||
"2 李莫愁 设计师 2056 3500 800 20 研发部 成都\n",
|
||
"3 张无忌 程序员 2056 3200 800 20 研发部 成都\n",
|
||
"4 丘处机 程序员 2056 3400 800 20 研发部 成都\n",
|
||
"5 欧阳锋 程序员 3088 3200 800 20 研发部 成都\n",
|
||
"6 张翠山 程序员 2056 4000 800 20 研发部 成都\n",
|
||
"7 黄蓉 销售主管 7800 3000 800 30 销售部 重庆\n",
|
||
"8 杨过 会计 5566 2200 840 10 会计部 北京\n",
|
||
"9 朱九真 会计 5566 2500 880 10 会计部 北京\n",
|
||
"10 苗人凤 销售员 3344 2500 920 30 销售部 重庆\n",
|
||
"11 郭靖 出纳 5566 2000 960 10 会计部 北京\n",
|
||
"12 宋远桥 会计师 7800 4000 1000 10 会计部 北京\n",
|
||
"13 张三丰 总裁 -1 9000 1200 20 研发部 成都\n",
|
||
"14 张三丰 总裁 -1 50000 8000 20 研发部 成都\n",
|
||
"15 王大锤 程序员 9800 8000 600 20 研发部 成都\n",
|
||
"16 张三丰 总裁 -1 60000 6000 20 研发部 成都\n",
|
||
"17 骆昊 架构师 7800 30000 5000 20 研发部 成都\n",
|
||
"18 陈小刀 分析师 9800 10000 1200 20 研发部 成都"
|
||
]
|
||
},
|
||
"execution_count": 74,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"all_emp_df['mgr'] = all_emp_df.mgr.fillna(-1).astype('i8')\n",
|
||
"all_emp_df"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 75,
|
||
"id": "cd376d13-2245-48b8-ba14-3315d4c48f9c",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 False\n",
|
||
"1 False\n",
|
||
"2 False\n",
|
||
"3 False\n",
|
||
"4 False\n",
|
||
"5 False\n",
|
||
"6 False\n",
|
||
"7 False\n",
|
||
"8 False\n",
|
||
"9 False\n",
|
||
"10 False\n",
|
||
"11 False\n",
|
||
"12 False\n",
|
||
"13 False\n",
|
||
"14 True\n",
|
||
"15 False\n",
|
||
"16 True\n",
|
||
"17 False\n",
|
||
"18 False\n",
|
||
"Name: ename, dtype: bool"
|
||
]
|
||
},
|
||
"execution_count": 75,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Identify duplicate values\n",
|
||
"all_emp_df.ename.duplicated()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 76,
|
||
"id": "6e107a38-c5e8-4e5e-9e42-71481c54e0d1",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"0 False\n",
|
||
"1 False\n",
|
||
"2 False\n",
|
||
"3 False\n",
|
||
"4 False\n",
|
||
"5 False\n",
|
||
"6 False\n",
|
||
"7 False\n",
|
||
"8 False\n",
|
||
"9 False\n",
|
||
"10 False\n",
|
||
"11 False\n",
|
||
"12 False\n",
|
||
"13 False\n",
|
||
"14 True\n",
|
||
"15 False\n",
|
||
"16 True\n",
|
||
"17 False\n",
|
||
"18 False\n",
|
||
"dtype: bool"
|
||
]
|
||
},
|
||
"execution_count": 76,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"all_emp_df.duplicated(['ename', 'job'])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 77,
|
||
"id": "097eaaf2-1112-4e0f-b361-786bf91d6c1f",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"ename\n",
|
||
"张三丰 3\n",
|
||
"胡一刀 1\n",
|
||
"朱九真 1\n",
|
||
"骆昊 1\n",
|
||
"王大锤 1\n",
|
||
"宋远桥 1\n",
|
||
"郭靖 1\n",
|
||
"苗人凤 1\n",
|
||
"杨过 1\n",
|
||
"乔峰 1\n",
|
||
"黄蓉 1\n",
|
||
"张翠山 1\n",
|
||
"欧阳锋 1\n",
|
||
"丘处机 1\n",
|
||
"张无忌 1\n",
|
||
"李莫愁 1\n",
|
||
"陈小刀 1\n",
|
||
"Name: count, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 77,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Count how often each value occurs\n",
|
||
"all_emp_df.ename.value_counts()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 78,
|
||
"id": "6494bb56-7ac7-47df-a9f1-960b02586e31",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"job\n",
|
||
"程序员 5\n",
|
||
"总裁 3\n",
|
||
"销售员 2\n",
|
||
"分析师 2\n",
|
||
"会计 2\n",
|
||
"设计师 1\n",
|
||
"销售主管 1\n",
|
||
"出纳 1\n",
|
||
"会计师 1\n",
|
||
"架构师 1\n",
|
||
"Name: count, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 78,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"all_emp_df.job.value_counts()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 79,
|
||
"id": "172e4d9a-63bd-44ca-98ea-e4614c8823ab",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"17"
|
||
]
|
||
},
|
||
"execution_count": 79,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Count the number of distinct values\n",
|
||
"all_emp_df.ename.nunique()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 80,
|
||
"id": "d6fa062c-d338-407f-8647-e84878a5642e",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>ename</th>\n",
|
||
" <th>job</th>\n",
|
||
" <th>mgr</th>\n",
|
||
" <th>sal</th>\n",
|
||
" <th>comm</th>\n",
|
||
" <th>dno</th>\n",
|
||
" <th>dname</th>\n",
|
||
" <th>dloc</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>胡一刀</td>\n",
|
||
" <td>销售员</td>\n",
|
||
" <td>3344</td>\n",
|
||
" <td>1800</td>\n",
|
||
" <td>200</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>乔峰</td>\n",
|
||
" <td>分析师</td>\n",
|
||
" <td>7800</td>\n",
|
||
" <td>5000</td>\n",
|
||
" <td>1500</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>李莫愁</td>\n",
|
||
" <td>设计师</td>\n",
|
||
" <td>2056</td>\n",
|
||
" <td>3500</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>张无忌</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056</td>\n",
|
||
" <td>3200</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>丘处机</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056</td>\n",
|
||
" <td>3400</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>欧阳锋</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>3088</td>\n",
|
||
" <td>3200</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>张翠山</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>2056</td>\n",
|
||
" <td>4000</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>黄蓉</td>\n",
|
||
" <td>销售主管</td>\n",
|
||
" <td>7800</td>\n",
|
||
" <td>3000</td>\n",
|
||
" <td>800</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8</th>\n",
|
||
" <td>杨过</td>\n",
|
||
" <td>会计</td>\n",
|
||
" <td>5566</td>\n",
|
||
" <td>2200</td>\n",
|
||
" <td>840</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9</th>\n",
|
||
" <td>朱九真</td>\n",
|
||
" <td>会计</td>\n",
|
||
" <td>5566</td>\n",
|
||
" <td>2500</td>\n",
|
||
" <td>880</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10</th>\n",
|
||
" <td>苗人凤</td>\n",
|
||
" <td>销售员</td>\n",
|
||
" <td>3344</td>\n",
|
||
" <td>2500</td>\n",
|
||
" <td>920</td>\n",
|
||
" <td>30</td>\n",
|
||
" <td>销售部</td>\n",
|
||
" <td>重庆</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>11</th>\n",
|
||
" <td>郭靖</td>\n",
|
||
" <td>出纳</td>\n",
|
||
" <td>5566</td>\n",
|
||
" <td>2000</td>\n",
|
||
" <td>960</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>12</th>\n",
|
||
" <td>宋远桥</td>\n",
|
||
" <td>会计师</td>\n",
|
||
" <td>7800</td>\n",
|
||
" <td>4000</td>\n",
|
||
" <td>1000</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>会计部</td>\n",
|
||
" <td>北京</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>15</th>\n",
|
||
" <td>王大锤</td>\n",
|
||
" <td>程序员</td>\n",
|
||
" <td>9800</td>\n",
|
||
" <td>8000</td>\n",
|
||
" <td>600</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>16</th>\n",
|
||
" <td>张三丰</td>\n",
|
||
" <td>总裁</td>\n",
|
||
" <td>-1</td>\n",
|
||
" <td>60000</td>\n",
|
||
" <td>6000</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>17</th>\n",
|
||
" <td>骆昊</td>\n",
|
||
" <td>架构师</td>\n",
|
||
" <td>7800</td>\n",
|
||
" <td>30000</td>\n",
|
||
" <td>5000</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>18</th>\n",
|
||
" <td>陈小刀</td>\n",
|
||
" <td>分析师</td>\n",
|
||
" <td>9800</td>\n",
|
||
" <td>10000</td>\n",
|
||
" <td>1200</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>研发部</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" ename job mgr sal comm dno dname dloc\n",
|
||
"0 胡一刀 销售员 3344 1800 200 30 销售部 重庆\n",
|
||
"1 乔峰 分析师 7800 5000 1500 20 研发部 成都\n",
|
||
"2 李莫愁 设计师 2056 3500 800 20 研发部 成都\n",
|
||
"3 张无忌 程序员 2056 3200 800 20 研发部 成都\n",
|
||
"4 丘处机 程序员 2056 3400 800 20 研发部 成都\n",
|
||
"5 欧阳锋 程序员 3088 3200 800 20 研发部 成都\n",
|
||
"6 张翠山 程序员 2056 4000 800 20 研发部 成都\n",
|
||
"7 黄蓉 销售主管 7800 3000 800 30 销售部 重庆\n",
|
||
"8 杨过 会计 5566 2200 840 10 会计部 北京\n",
|
||
"9 朱九真 会计 5566 2500 880 10 会计部 北京\n",
|
||
"10 苗人凤 销售员 3344 2500 920 30 销售部 重庆\n",
|
||
"11 郭靖 出纳 5566 2000 960 10 会计部 北京\n",
|
||
"12 宋远桥 会计师 7800 4000 1000 10 会计部 北京\n",
|
||
"15 王大锤 程序员 9800 8000 600 20 研发部 成都\n",
|
||
"16 张三丰 总裁 -1 60000 6000 20 研发部 成都\n",
|
||
"17 骆昊 架构师 7800 30000 5000 20 研发部 成都\n",
|
||
"18 陈小刀 分析师 9800 10000 1200 20 研发部 成都"
|
||
]
|
||
},
|
||
"execution_count": 80,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Drop duplicate rows\n",
|
||
"# keep='first' is the default and keeps the first occurrence; the other options are 'last' and False (drop every duplicate)\n",
|
||
"all_emp_df.drop_duplicates(['ename', 'job'], keep='last', inplace=True)\n",
|
||
"all_emp_df"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 81,
|
||
"id": "832a2ea2-6941-4364-b143-af7db9ff9701",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Detecting outliers\n",
|
||
"# IQR rule: a value is an outlier if data < Q1 - 1.5 * IQR or data > Q3 + 1.5 * IQR\n",
|
||
"\n",
|
||
"\n",
|
||
"def find_outliers_by_iqr(data, whis=1.5):\n",
|
||
" q1, q3 = np.quantile(data, [0.25, 0.75])\n",
|
||
" iqr = q3 - q1\n",
|
||
" return data[(data < q1 - whis * iqr) | (data > q3 + whis * iqr)]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 82,
|
||
"id": "1cd5d6aa-c60e-483e-995c-a627a0dfec15",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([ 83., 81., 89., 89., 76., 79., 78., 76., 79., 74., 89.,\n",
|
||
" 61., 90., 74., 68., 81., 81., 93., 69., 81., 76., 87.,\n",
|
||
" 80., 90., 72., 89., 72., 71., 93., 75., 75., 73., 85.,\n",
|
||
" 91., 96., 82., 74., 80., 72., 83., 72., 64., 83., 79.,\n",
|
||
" 78., 68., 68., 70., 68., 84., 120., 160., 200., 40., 20.,\n",
|
||
" -50.])"
|
||
]
|
||
},
|
||
"execution_count": 82,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp = np.random.normal(80, 8, 50).round(0)\n",
|
||
"temp = np.append(temp, [120, 160, 200, 40, 20, -50])\n",
|
||
"temp"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 83,
|
||
"id": "2121dab4-0efc-4fcd-a5fe-67585552cb53",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([120., 160., 200., 40., 20., -50.])"
|
||
]
|
||
},
|
||
"execution_count": 83,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"find_outliers_by_iqr(temp)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 84,
|
||
"id": "da048825-3f88-4009-9db5-159e8e883b10",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([160., 200., 20., -50.])"
|
||
]
|
||
},
|
||
"execution_count": 84,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"find_outliers_by_iqr(temp, whis=3)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 85,
|
||
"id": "0da7034b-2350-43ff-a6eb-9e7f4361bdee",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Z-score method (three-sigma rule, i.e. the 68-95-99.7 rule)\n",
|
||
"\n",
|
||
"\n",
|
||
"def find_outliers_by_zscore(data, mul=3):\n",
|
||
" mu, sigma = np.mean(data), np.std(data)\n",
|
||
" zscore = (data - mu) / sigma\n",
|
||
" return data[np.abs(zscore) > mul]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 86,
|
||
"id": "e88616c0-a4d8-4fd8-9ec2-e761cb5ba056",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([200., -50.])"
|
||
]
|
||
},
|
||
"execution_count": 86,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"find_outliers_by_zscore(temp)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 87,
|
||
"id": "c902031c-2f78-4721-9734-5c5b0ca81650",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([160., 200., 20., -50.])"
|
||
]
|
||
},
|
||
"execution_count": 87,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"find_outliers_by_zscore(temp, mul=2)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 88,
|
||
"id": "1e295014-d582-4e78-b5b9-6d9f0463ff8d",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"订单号\n",
|
||
"G69924 23688\n",
|
||
"G70509 31935\n",
|
||
"G72204 26758\n",
|
||
"G70509 31594\n",
|
||
"G72186 30583\n",
|
||
"G70509 52302\n",
|
||
"G69631 32125\n",
|
||
"543369-010 29843\n",
|
||
"543367-077 31889\n",
|
||
"G69627 31028\n",
|
||
"G69645 23947\n",
|
||
"G72201 40327\n",
|
||
"G69631 26534\n",
|
||
"543367-077 25674\n",
|
||
"G71332 35120\n",
|
||
"588705-010 25502\n",
|
||
"543367-077 31375\n",
|
||
"G68188 29819\n",
|
||
"577714-010 40884\n",
|
||
"FT001-18-1763 25835\n",
|
||
"G72186 24770\n",
|
||
"G71330 29795\n",
|
||
"577714-010 27244\n",
|
||
"FT007-18-1763 25454\n",
|
||
"G69627 24634\n",
|
||
"G69627 23537\n",
|
||
"G72204 31613\n",
|
||
"543367-077 45442\n",
|
||
"G85411 22861\n",
|
||
"G68188 34290\n",
|
||
"AYMH063-1 29307\n",
|
||
"G69645 37782\n",
|
||
"Name: 直接成本, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 88,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"find_outliers_by_zscore(df6.直接成本)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 89,
|
||
"id": "97b98c82-fd09-42a9-8a75-a3e71ae10fbc",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>销售日期</th>\n",
|
||
" <th>区域</th>\n",
|
||
" <th>渠道</th>\n",
|
||
" <th>品牌</th>\n",
|
||
" <th>售价</th>\n",
|
||
" <th>销售数量</th>\n",
|
||
" <th>直接成本</th>\n",
|
||
" <th>销售额</th>\n",
|
||
" <th>月份</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>订单号</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>205654-519</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>169</td>\n",
|
||
" <td>14</td>\n",
|
||
" <td>485</td>\n",
|
||
" <td>2366</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>377781-010</th>\n",
|
||
" <td>2020-01-01</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>249</td>\n",
|
||
" <td>61</td>\n",
|
||
" <td>2452</td>\n",
|
||
" <td>15189</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>588685-002</th>\n",
|
||
" <td>2020-01-02</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>299</td>\n",
|
||
" <td>91</td>\n",
|
||
" <td>8008</td>\n",
|
||
" <td>27209</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>AKLH641-1</th>\n",
|
||
" <td>2020-01-03</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>239</td>\n",
|
||
" <td>82</td>\n",
|
||
" <td>4127</td>\n",
|
||
" <td>19598</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>AKLJ013-4</th>\n",
|
||
" <td>2020-01-03</td>\n",
|
||
" <td>上海</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>壁虎</td>\n",
|
||
" <td>219</td>\n",
|
||
" <td>57</td>\n",
|
||
" <td>2315</td>\n",
|
||
" <td>12483</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>588682-010</th>\n",
|
||
" <td>2020-12-29</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>拼多多</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>50</td>\n",
|
||
" <td>4388</td>\n",
|
||
" <td>13450</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>599007-513</th>\n",
|
||
" <td>2020-12-29</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>天猫</td>\n",
|
||
" <td>皮皮虾</td>\n",
|
||
" <td>349</td>\n",
|
||
" <td>18</td>\n",
|
||
" <td>2466</td>\n",
|
||
" <td>6282</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>D89677</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>北京</td>\n",
|
||
" <td>京东</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>26</td>\n",
|
||
" <td>1560</td>\n",
|
||
" <td>6994</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>182719-050</th>\n",
|
||
" <td>2020-12-30</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>八匹马</td>\n",
|
||
" <td>79</td>\n",
|
||
" <td>97</td>\n",
|
||
" <td>3028</td>\n",
|
||
" <td>7663</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>G70083</th>\n",
|
||
" <td>2020-12-31</td>\n",
|
||
" <td>福建</td>\n",
|
||
" <td>实体</td>\n",
|
||
" <td>花花姑娘</td>\n",
|
||
" <td>269</td>\n",
|
||
" <td>55</td>\n",
|
||
" <td>2277</td>\n",
|
||
" <td>14795</td>\n",
|
||
" <td>12</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>1711 rows × 9 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 销售日期 区域 渠道 品牌 售价 销售数量 直接成本 销售额 月份\n",
|
||
"订单号 \n",
|
||
"205654-519 2020-01-01 上海 天猫 八匹马 169 14 485 2366 1\n",
|
||
"377781-010 2020-01-01 上海 天猫 皮皮虾 249 61 2452 15189 1\n",
|
||
"588685-002 2020-01-02 上海 拼多多 皮皮虾 299 91 8008 27209 1\n",
|
||
"AKLH641-1 2020-01-03 上海 天猫 壁虎 239 82 4127 19598 1\n",
|
||
"AKLJ013-4 2020-01-03 上海 天猫 壁虎 219 57 2315 12483 1\n",
|
||
"... ... .. ... ... ... ... ... ... ..\n",
|
||
"588682-010 2020-12-29 北京 拼多多 皮皮虾 269 50 4388 13450 12\n",
|
||
"599007-513 2020-12-29 北京 天猫 皮皮虾 349 18 2466 6282 12\n",
|
||
"D89677 2020-12-30 北京 京东 花花姑娘 269 26 1560 6994 12\n",
|
||
"182719-050 2020-12-30 福建 实体 八匹马 79 97 3028 7663 12\n",
|
||
"G70083 2020-12-31 福建 实体 花花姑娘 269 55 2277 14795 12\n",
|
||
"\n",
|
||
"[1711 rows x 9 columns]"
|
||
]
|
||
},
|
||
"execution_count": 89,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Drop rows by the row index of the outliers\n",
|
||
"df6.drop(index=find_outliers_by_zscore(df6.直接成本).index)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 90,
|
||
"id": "0053ed12-c09f-4331-a6dd-487ff990c680",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"79.0"
|
||
]
|
||
},
|
||
"execution_count": 90,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"med_value = np.median(temp)\n",
|
||
"med_value"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 91,
|
||
"id": "f02c2985-1b07-4b1c-b248-aa1de9e98451",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([160., 200., 20., -50.])"
|
||
]
|
||
},
|
||
"execution_count": 91,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"find_outliers_by_zscore(temp, mul=2)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 92,
|
||
"id": "485adc15-f39d-419b-9869-2b366f5d88ec",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([False, False, False, False, False, False, False, False, False,\n",
|
||
" False, False, False, False, False, False, False, False, False,\n",
|
||
" False, False, False, False, False, False, False, False, False,\n",
|
||
" False, False, False, False, False, False, False, False, False,\n",
|
||
" False, False, False, False, False, False, False, False, False,\n",
|
||
" False, False, False, False, False, False, True, True, False,\n",
|
||
" True, True])"
|
||
]
|
||
},
|
||
"execution_count": 92,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"np.isin(temp, find_outliers_by_zscore(temp, mul=2))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 93,
|
||
"id": "ce92f242-1f0f-476e-ae85-91e1615783ef",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Replace the outliers with the median\n",
|
||
"np.place(temp, np.isin(temp, find_outliers_by_zscore(temp, mul=2)), med_value)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 94,
|
||
"id": "10b0b0bc-f98c-40fe-890f-976df9d9c52b",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([ 83., 81., 89., 89., 76., 79., 78., 76., 79., 74., 89.,\n",
|
||
" 61., 90., 74., 68., 81., 81., 93., 69., 81., 76., 87.,\n",
|
||
" 80., 90., 72., 89., 72., 71., 93., 75., 75., 73., 85.,\n",
|
||
" 91., 96., 82., 74., 80., 72., 83., 72., 64., 83., 79.,\n",
|
||
" 78., 68., 68., 70., 68., 84., 120., 79., 79., 40., 79.,\n",
|
||
" 79.])"
|
||
]
|
||
},
|
||
"execution_count": 94,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"temp"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "d970e838-42f2-44d0-8f2d-07ebbf6de2b0",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### Case 1: Cleaning and preprocessing job-posting data\n",
"\n",
"1. Load the data\n",
"2. Deduplicate\n",
"3. Extract data\n",
"4. Split columns\n",
"5. Replace values\n",
"6. Filter the data"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 95,
|
||
"id": "1ec417a9-457f-434e-96a6-f4fd35d75987",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>company_name</th>\n",
|
||
" <th>uri</th>\n",
|
||
" <th>salary</th>\n",
|
||
" <th>site</th>\n",
|
||
" <th>year</th>\n",
|
||
" <th>edu</th>\n",
|
||
" <th>job_name</th>\n",
|
||
" <th>city</th>\n",
|
||
" <th>pos_count</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>软通动力集团</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/7ece55fbcbf7...</td>\n",
|
||
" <td>10-15K</td>\n",
|
||
" <td>成都 武侯区 草金立交</td>\n",
|
||
" <td>1-3年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>python开发</td>\n",
|
||
" <td>chengdu</td>\n",
|
||
" <td>2</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>思湃德</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/760b2b05535c...</td>\n",
|
||
" <td>20-40K</td>\n",
|
||
" <td>成都 双流区 华阳</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>Python</td>\n",
|
||
" <td>chengdu</td>\n",
|
||
" <td>5</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>源码时代</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/9575f02d9a9f...</td>\n",
|
||
" <td>15-20K</td>\n",
|
||
" <td>成都 武侯区 石羊</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>大专</td>\n",
|
||
" <td>python 讲师</td>\n",
|
||
" <td>chengdu</td>\n",
|
||
" <td>3</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>三源合众</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/912b6da8b12f...</td>\n",
|
||
" <td>6-10K</td>\n",
|
||
" <td>成都 武侯区 新会展</td>\n",
|
||
" <td>1年以内</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>Python</td>\n",
|
||
" <td>chengdu</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>软通动力</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/c61ef9b261da...</td>\n",
|
||
" <td>8-13K</td>\n",
|
||
" <td>成都 武侯区 机投</td>\n",
|
||
" <td>1-3年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>python开发</td>\n",
|
||
" <td>chengdu</td>\n",
|
||
" <td>3</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5</th>\n",
|
||
" <td>中佳业</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/4fbb387a96ff...</td>\n",
|
||
" <td>7-8K·13薪</td>\n",
|
||
" <td>成都 武侯区 肖家河</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>大专</td>\n",
|
||
" <td>C++/Python</td>\n",
|
||
" <td>chengdu</td>\n",
|
||
" <td>4</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6</th>\n",
|
||
" <td>知行天下</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/17fe77fdd3b1...</td>\n",
|
||
" <td>7-12K</td>\n",
|
||
" <td>成都 龙泉驿区 龙泉</td>\n",
|
||
" <td>1年以内</td>\n",
|
||
" <td>大专</td>\n",
|
||
" <td>Python讲师</td>\n",
|
||
" <td>chengdu</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7</th>\n",
|
||
" <td>川大智胜</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/77186b69e915...</td>\n",
|
||
" <td>8-13K</td>\n",
|
||
" <td>成都 武侯区 保利花园</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>Python</td>\n",
|
||
" <td>chengdu</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8</th>\n",
|
||
" <td>电科荷福研究院</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/586d4207f3d7...</td>\n",
|
||
" <td>7-12K</td>\n",
|
||
" <td>成都 郫都区 高新西</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>python后端开发</td>\n",
|
||
" <td>chengdu</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9</th>\n",
|
||
" <td>傲梦网络科技</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/dafa272932d0...</td>\n",
|
||
" <td>3-8K</td>\n",
|
||
" <td>成都 武侯区 高升桥</td>\n",
|
||
" <td>经验不限</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>python线上试听课老师</td>\n",
|
||
" <td>chengdu</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" company_name uri salary \\\n",
|
||
"0 软通动力集团 https://www.zhipin.com/job_detail/7ece55fbcbf7... 10-15K \n",
|
||
"1 思湃德 https://www.zhipin.com/job_detail/760b2b05535c... 20-40K \n",
|
||
"2 源码时代 https://www.zhipin.com/job_detail/9575f02d9a9f... 15-20K \n",
|
||
"3 三源合众 https://www.zhipin.com/job_detail/912b6da8b12f... 6-10K \n",
|
||
"4 软通动力 https://www.zhipin.com/job_detail/c61ef9b261da... 8-13K \n",
|
||
"5 中佳业 https://www.zhipin.com/job_detail/4fbb387a96ff... 7-8K·13薪 \n",
|
||
"6 知行天下 https://www.zhipin.com/job_detail/17fe77fdd3b1... 7-12K \n",
|
||
"7 川大智胜 https://www.zhipin.com/job_detail/77186b69e915... 8-13K \n",
|
||
"8 电科荷福研究院 https://www.zhipin.com/job_detail/586d4207f3d7... 7-12K \n",
|
||
"9 傲梦网络科技 https://www.zhipin.com/job_detail/dafa272932d0... 3-8K \n",
|
||
"\n",
|
||
" site year edu job_name city pos_count \n",
|
||
"0 成都 武侯区 草金立交 1-3年 本科 python开发 chengdu 2 \n",
|
||
"1 成都 双流区 华阳 3-5年 本科 Python chengdu 5 \n",
|
||
"2 成都 武侯区 石羊 3-5年 大专 python 讲师 chengdu 3 \n",
|
||
"3 成都 武侯区 新会展 1年以内 本科 Python chengdu 1 \n",
|
||
"4 成都 武侯区 机投 1-3年 本科 python开发 chengdu 3 \n",
|
||
"5 成都 武侯区 肖家河 3-5年 大专 C++/Python chengdu 4 \n",
|
||
"6 成都 龙泉驿区 龙泉 1年以内 大专 Python讲师 chengdu 1 \n",
|
||
"7 成都 武侯区 保利花园 3-5年 本科 Python chengdu 1 \n",
|
||
"8 成都 郫都区 高新西 3-5年 本科 python后端开发 chengdu 6 \n",
|
||
"9 成都 武侯区 高升桥 经验不限 本科 python线上试听课老师 chengdu 6 "
|
||
]
|
||
},
|
||
"execution_count": 95,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"jobs_df = pd.read_csv('res/all_jobs.csv')\n",
|
||
"jobs_df.head(10)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 96,
|
||
"id": "74e0e4a5-3c03-4617-9661-8cfa03b88fd7",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"(9777, 9)"
|
||
]
|
||
},
|
||
"execution_count": 96,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Deduplicate on the uri column\n",
|
||
"jobs_df.drop_duplicates('uri', inplace=True)\n",
|
||
"jobs_df.shape"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 97,
|
||
"id": "6cca7b8b-25f1-46b8-9946-34ba90f42116",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>company_name</th>\n",
|
||
" <th>uri</th>\n",
|
||
" <th>salary</th>\n",
|
||
" <th>site</th>\n",
|
||
" <th>year</th>\n",
|
||
" <th>edu</th>\n",
|
||
" <th>job_name</th>\n",
|
||
" <th>city</th>\n",
|
||
" <th>pos_count</th>\n",
|
||
" <th>salary_lower</th>\n",
|
||
" <th>salary_upper</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>软通动力集团</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/7ece55fbcbf7...</td>\n",
|
||
" <td>12.5</td>\n",
|
||
" <td>成都 武侯区 草金立交</td>\n",
|
||
" <td>1-3年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>python开发</td>\n",
|
||
" <td>chengdu</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>15</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>思湃德</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/760b2b05535c...</td>\n",
|
||
" <td>30.0</td>\n",
|
||
" <td>成都 双流区 华阳</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>Python</td>\n",
|
||
" <td>chengdu</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>40</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>源码时代</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/9575f02d9a9f...</td>\n",
|
||
" <td>17.5</td>\n",
|
||
" <td>成都 武侯区 石羊</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>大专</td>\n",
|
||
" <td>python 讲师</td>\n",
|
||
" <td>chengdu</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>15</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>三源合众</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/912b6da8b12f...</td>\n",
|
||
" <td>8.0</td>\n",
|
||
" <td>成都 武侯区 新会展</td>\n",
|
||
" <td>1年以内</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>Python</td>\n",
|
||
" <td>chengdu</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>软通动力</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/c61ef9b261da...</td>\n",
|
||
" <td>10.5</td>\n",
|
||
" <td>成都 武侯区 机投</td>\n",
|
||
" <td>1-3年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>python开发</td>\n",
|
||
" <td>chengdu</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>8</td>\n",
|
||
" <td>13</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9820</th>\n",
|
||
" <td>公众智能</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/7b9c08dbce81...</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>西安</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>产品经理</td>\n",
|
||
" <td>xian</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>8</td>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9821</th>\n",
|
||
" <td>微感</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/c7e99005528f...</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>西安 雁塔区 紫薇田园都市</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>大专</td>\n",
|
||
" <td>产品经理</td>\n",
|
||
" <td>xian</td>\n",
|
||
" <td>4</td>\n",
|
||
" <td>8</td>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9822</th>\n",
|
||
" <td>巴斯光年</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/1045fe64f248...</td>\n",
|
||
" <td>15.0</td>\n",
|
||
" <td>西安 雁塔区 大雁塔</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>产品经理</td>\n",
|
||
" <td>xian</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9823</th>\n",
|
||
" <td>西大华特科技</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/e3c21cc748e7...</td>\n",
|
||
" <td>6.5</td>\n",
|
||
" <td>西安 雁塔区 唐延路</td>\n",
|
||
" <td>1-3年</td>\n",
|
||
" <td>硕士</td>\n",
|
||
" <td>产品经理(农药)</td>\n",
|
||
" <td>xian</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>8</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9824</th>\n",
|
||
" <td>西安纯粹科技</td>\n",
|
||
" <td>https://www.zhipin.com/job_detail/09965129db3e...</td>\n",
|
||
" <td>4.5</td>\n",
|
||
" <td>西安 雁塔区 玫瑰大楼</td>\n",
|
||
" <td>1-3年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>产品经理</td>\n",
|
||
" <td>xian</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>9777 rows × 11 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" company_name uri salary \\\n",
|
||
"0 软通动力集团 https://www.zhipin.com/job_detail/7ece55fbcbf7... 12.5 \n",
|
||
"1 思湃德 https://www.zhipin.com/job_detail/760b2b05535c... 30.0 \n",
|
||
"2 源码时代 https://www.zhipin.com/job_detail/9575f02d9a9f... 17.5 \n",
|
||
"3 三源合众 https://www.zhipin.com/job_detail/912b6da8b12f... 8.0 \n",
|
||
"4 软通动力 https://www.zhipin.com/job_detail/c61ef9b261da... 10.5 \n",
|
||
"... ... ... ... \n",
|
||
"9820 公众智能 https://www.zhipin.com/job_detail/7b9c08dbce81... 9.0 \n",
|
||
"9821 微感 https://www.zhipin.com/job_detail/c7e99005528f... 9.0 \n",
|
||
"9822 巴斯光年 https://www.zhipin.com/job_detail/1045fe64f248... 15.0 \n",
|
||
"9823 西大华特科技 https://www.zhipin.com/job_detail/e3c21cc748e7... 6.5 \n",
|
||
"9824 西安纯粹科技 https://www.zhipin.com/job_detail/09965129db3e... 4.5 \n",
|
||
"\n",
|
||
" site year edu job_name city pos_count salary_lower \\\n",
|
||
"0 成都 武侯区 草金立交 1-3年 本科 python开发 chengdu 2 10 \n",
|
||
"1 成都 双流区 华阳 3-5年 本科 Python chengdu 5 20 \n",
|
||
"2 成都 武侯区 石羊 3-5年 大专 python 讲师 chengdu 3 15 \n",
|
||
"3 成都 武侯区 新会展 1年以内 本科 Python chengdu 1 6 \n",
|
||
"4 成都 武侯区 机投 1-3年 本科 python开发 chengdu 3 8 \n",
|
||
"... ... ... .. ... ... ... ... \n",
|
||
"9820 西安 3-5年 本科 产品经理 xian 2 8 \n",
|
||
"9821 西安 雁塔区 紫薇田园都市 3-5年 大专 产品经理 xian 4 8 \n",
|
||
"9822 西安 雁塔区 大雁塔 3-5年 本科 产品经理 xian 6 10 \n",
|
||
"9823 西安 雁塔区 唐延路 1-3年 硕士 产品经理(农药) xian 6 5 \n",
|
||
"9824 西安 雁塔区 玫瑰大楼 1-3年 本科 产品经理 xian 5 3 \n",
|
||
"\n",
|
||
" salary_upper \n",
|
||
"0 15 \n",
|
||
"1 40 \n",
|
||
"2 20 \n",
|
||
"3 10 \n",
|
||
"4 13 \n",
|
||
"... ... \n",
|
||
"9820 10 \n",
|
||
"9821 10 \n",
|
||
"9822 20 \n",
|
||
"9823 8 \n",
|
||
"9824 6 \n",
|
||
"\n",
|
||
"[9777 rows x 11 columns]"
|
||
]
|
||
},
|
||
"execution_count": 97,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Extract the lower and upper salary bounds from the salary column with a regular expression\n",
|
||
"jobs_df[['salary_lower', 'salary_upper']] = jobs_df.salary.str.extract(r'(\\d+)-(\\d+)').astype('i8')\n",
|
||
"jobs_df['salary'] = (jobs_df.salary_lower + jobs_df.salary_upper) / 2\n",
|
||
"jobs_df"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 98,
|
||
"id": "ffaea2af-09f6-4577-9c0d-024966d6854f",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>company_name</th>\n",
|
||
" <th>salary</th>\n",
|
||
" <th>site</th>\n",
|
||
" <th>year</th>\n",
|
||
" <th>edu</th>\n",
|
||
" <th>job_name</th>\n",
|
||
" <th>pos_count</th>\n",
|
||
" <th>salary_lower</th>\n",
|
||
" <th>salary_upper</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>软通动力集团</td>\n",
|
||
" <td>12.5</td>\n",
|
||
" <td>成都 武侯区 草金立交</td>\n",
|
||
" <td>1-3年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>python开发</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>15</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>思湃德</td>\n",
|
||
" <td>30.0</td>\n",
|
||
" <td>成都 双流区 华阳</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>Python</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>40</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>源码时代</td>\n",
|
||
" <td>17.5</td>\n",
|
||
" <td>成都 武侯区 石羊</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>大专</td>\n",
|
||
" <td>python 讲师</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>15</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>三源合众</td>\n",
|
||
" <td>8.0</td>\n",
|
||
" <td>成都 武侯区 新会展</td>\n",
|
||
" <td>1年以内</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>Python</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>软通动力</td>\n",
|
||
" <td>10.5</td>\n",
|
||
" <td>成都 武侯区 机投</td>\n",
|
||
" <td>1-3年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>python开发</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>8</td>\n",
|
||
" <td>13</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9820</th>\n",
|
||
" <td>公众智能</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>西安</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>产品经理</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>8</td>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9821</th>\n",
|
||
" <td>微感</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>西安 雁塔区 紫薇田园都市</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>大专</td>\n",
|
||
" <td>产品经理</td>\n",
|
||
" <td>4</td>\n",
|
||
" <td>8</td>\n",
|
||
" <td>10</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9822</th>\n",
|
||
" <td>巴斯光年</td>\n",
|
||
" <td>15.0</td>\n",
|
||
" <td>西安 雁塔区 大雁塔</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>产品经理</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>20</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9823</th>\n",
|
||
" <td>西大华特科技</td>\n",
|
||
" <td>6.5</td>\n",
|
||
" <td>西安 雁塔区 唐延路</td>\n",
|
||
" <td>1-3年</td>\n",
|
||
" <td>硕士</td>\n",
|
||
" <td>产品经理(农药)</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>8</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9824</th>\n",
|
||
" <td>西安纯粹科技</td>\n",
|
||
" <td>4.5</td>\n",
|
||
" <td>西安 雁塔区 玫瑰大楼</td>\n",
|
||
" <td>1-3年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>产品经理</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>6</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>9777 rows × 9 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" company_name salary site year edu job_name pos_count \\\n",
|
||
"0 软通动力集团 12.5 成都 武侯区 草金立交 1-3年 本科 python开发 2 \n",
|
||
"1 思湃德 30.0 成都 双流区 华阳 3-5年 本科 Python 5 \n",
|
||
"2 源码时代 17.5 成都 武侯区 石羊 3-5年 大专 python 讲师 3 \n",
|
||
"3 三源合众 8.0 成都 武侯区 新会展 1年以内 本科 Python 1 \n",
|
||
"4 软通动力 10.5 成都 武侯区 机投 1-3年 本科 python开发 3 \n",
|
||
"... ... ... ... ... .. ... ... \n",
|
||
"9820 公众智能 9.0 西安 3-5年 本科 产品经理 2 \n",
|
||
"9821 微感 9.0 西安 雁塔区 紫薇田园都市 3-5年 大专 产品经理 4 \n",
|
||
"9822 巴斯光年 15.0 西安 雁塔区 大雁塔 3-5年 本科 产品经理 6 \n",
|
||
"9823 西大华特科技 6.5 西安 雁塔区 唐延路 1-3年 硕士 产品经理(农药) 6 \n",
|
||
"9824 西安纯粹科技 4.5 西安 雁塔区 玫瑰大楼 1-3年 本科 产品经理 5 \n",
|
||
"\n",
|
||
" salary_lower salary_upper \n",
|
||
"0 10 15 \n",
|
||
"1 20 40 \n",
|
||
"2 15 20 \n",
|
||
"3 6 10 \n",
|
||
"4 8 13 \n",
|
||
"... ... ... \n",
|
||
"9820 8 10 \n",
|
||
"9821 8 10 \n",
|
||
"9822 10 20 \n",
|
||
"9823 5 8 \n",
|
||
"9824 3 6 \n",
|
||
"\n",
|
||
"[9777 rows x 9 columns]"
|
||
]
|
||
},
|
||
"execution_count": 98,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"jobs_df.drop(columns=['uri', 'city'], inplace=True)\n",
|
||
"jobs_df"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 99,
|
||
"id": "d9ba5998-ca1d-44c8-87ca-363356074dd5",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>company_name</th>\n",
|
||
" <th>salary</th>\n",
|
||
" <th>year</th>\n",
|
||
" <th>edu</th>\n",
|
||
" <th>job_name</th>\n",
|
||
" <th>pos_count</th>\n",
|
||
" <th>salary_lower</th>\n",
|
||
" <th>salary_upper</th>\n",
|
||
" <th>city</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>软通动力集团</td>\n",
|
||
" <td>12.5</td>\n",
|
||
" <td>1-3年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>python开发</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>15</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>思湃德</td>\n",
|
||
" <td>30.0</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>Python</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>40</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>2</th>\n",
|
||
" <td>源码时代</td>\n",
|
||
" <td>17.5</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>大专</td>\n",
|
||
" <td>python 讲师</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>15</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3</th>\n",
|
||
" <td>三源合众</td>\n",
|
||
" <td>8.0</td>\n",
|
||
" <td>1年以内</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>Python</td>\n",
|
||
" <td>1</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>4</th>\n",
|
||
" <td>软通动力</td>\n",
|
||
" <td>10.5</td>\n",
|
||
" <td>1-3年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>python开发</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>8</td>\n",
|
||
" <td>13</td>\n",
|
||
" <td>成都</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9820</th>\n",
|
||
" <td>公众智能</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>产品经理</td>\n",
|
||
" <td>2</td>\n",
|
||
" <td>8</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>西安</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9821</th>\n",
|
||
" <td>微感</td>\n",
|
||
" <td>9.0</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>大专</td>\n",
|
||
" <td>产品经理</td>\n",
|
||
" <td>4</td>\n",
|
||
" <td>8</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>西安</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9822</th>\n",
|
||
" <td>巴斯光年</td>\n",
|
||
" <td>15.0</td>\n",
|
||
" <td>3-5年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>产品经理</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>10</td>\n",
|
||
" <td>20</td>\n",
|
||
" <td>西安</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9823</th>\n",
|
||
" <td>西大华特科技</td>\n",
|
||
" <td>6.5</td>\n",
|
||
" <td>1-3年</td>\n",
|
||
" <td>硕士</td>\n",
|
||
" <td>产品经理(农药)</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>8</td>\n",
|
||
" <td>西安</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9824</th>\n",
|
||
" <td>西安纯粹科技</td>\n",
|
||
" <td>4.5</td>\n",
|
||
" <td>1-3年</td>\n",
|
||
" <td>本科</td>\n",
|
||
" <td>产品经理</td>\n",
|
||
" <td>5</td>\n",
|
||
" <td>3</td>\n",
|
||
" <td>6</td>\n",
|
||
" <td>西安</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>9777 rows × 9 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" company_name salary year edu job_name pos_count salary_lower \\\n",
|
||
"0 软通动力集团 12.5 1-3年 本科 python开发 2 10 \n",
|
||
"1 思湃德 30.0 3-5年 本科 Python 5 20 \n",
|
||
"2 源码时代 17.5 3-5年 大专 python 讲师 3 15 \n",
|
||
"3 三源合众 8.0 1年以内 本科 Python 1 6 \n",
|
||
"4 软通动力 10.5 1-3年 本科 python开发 3 8 \n",
|
||
"... ... ... ... .. ... ... ... \n",
|
||
"9820 公众智能 9.0 3-5年 本科 产品经理 2 8 \n",
|
||
"9821 微感 9.0 3-5年 大专 产品经理 4 8 \n",
|
||
"9822 巴斯光年 15.0 3-5年 本科 产品经理 6 10 \n",
|
||
"9823 西大华特科技 6.5 1-3年 硕士 产品经理(农药) 6 5 \n",
|
||
"9824 西安纯粹科技 4.5 1-3年 本科 产品经理 5 3 \n",
|
||
"\n",
|
||
" salary_upper city \n",
|
||
"0 15 成都 \n",
|
||
"1 40 成都 \n",
|
||
"2 20 成都 \n",
|
||
"3 10 成都 \n",
|
||
"4 13 成都 \n",
|
||
"... ... ... \n",
|
||
"9820 10 西安 \n",
|
||
"9821 10 西安 \n",
|
||
"9822 20 西安 \n",
|
||
"9823 8 西安 \n",
|
||
"9824 6 西安 \n",
|
||
"\n",
|
||
"[9777 rows x 9 columns]"
|
||
]
|
||
},
|
||
"execution_count": 99,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# Split the site column on whitespace and keep the first part as the city\n",
|
||
"jobs_df['city'] = jobs_df.site.str.split(expand=True)[0]\n",
|
||
"jobs_df.drop(columns='site', inplace=True)\n",
|
||
"jobs_df"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 100,
|
||
"id": "933e9006-4f5e-4238-b6d9-940dfeb6caf1",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Replace string values using a regular expression\n",
|
||
"jobs_df['year'] = jobs_df.year.replace(r'5-10年|10年以上', '5年以上', regex=True)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 101,
|
||
"id": "d10a9c1c-a9d5-49e1-8fdf-a68b5bb3d59a",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array(['1-3年', '3-5年', '1年以内', '经验不限', '5年以上', '应届生'], dtype=object)"
|
||
]
|
||
},
|
||
"execution_count": 101,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"jobs_df.year.unique()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 102,
|
||
"id": "d248e233-bac5-48d5-8a69-a1f04350867a",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"jobs_df['edu'] = jobs_df.edu.replace(r'中专|高中', '学历不限', regex=True)\n",
|
||
"jobs_df['edu'] = jobs_df.edu.replace(r'硕士|博士', '研究生', regex=True)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 103,
|
||
"id": "eec6fbd5-2355-4674-9e5d-7f47a5a808a2",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array(['本科', '大专', '学历不限', '研究生'], dtype=object)"
|
||
]
|
||
},
|
||
"execution_count": 103,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"jobs_df.edu.unique()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 104,
|
||
"id": "352b1921-aa2b-4016-af3e-02032b2a3935",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"(6487, 9)"
|
||
]
|
||
},
|
||
"execution_count": 104,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"jobs_df['job_name'] = jobs_df.job_name.str.lower()\n",
|
||
"jobs_df = jobs_df[jobs_df.job_name.str.contains('python|数据|产品|运营|data', regex=True)]\n",
|
||
"jobs_df.shape"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 105,
|
||
"id": "df370013-1278-48d2-9891-8647df3c5e15",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"jobs_df.to_csv('res/cleand_jobs.csv', index=False)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "8ee07676-737c-420e-b11a-235ff7f2c4c8",
|
||
"metadata": {},
|
||
"source": [
|
||
"#### 案例2:北京积分落户数据预处理\n",
|
||
"\n",
|
||
"1. 加载数据\n",
|
||
"2. 日期时间处理\n",
|
||
"3. 年龄段分箱\n",
|
||
"4. 落户积分归一化"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 106,
|
||
"id": "1232d023-7591-47b3-b67b-4920642dd28d",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>姓名</th>\n",
|
||
" <th>出生年月</th>\n",
|
||
" <th>单位名称</th>\n",
|
||
" <th>积分分值</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>公示编号</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>202300001</th>\n",
|
||
" <td>张浩</td>\n",
|
||
" <td>1977-02</td>\n",
|
||
" <td>北京首钢股份有限公司</td>\n",
|
||
" <td>140.05</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300002</th>\n",
|
||
" <td>冯云</td>\n",
|
||
" <td>1982-02</td>\n",
|
||
" <td>中国人民解放军空军二十三厂</td>\n",
|
||
" <td>134.29</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300003</th>\n",
|
||
" <td>王天东</td>\n",
|
||
" <td>1975-01</td>\n",
|
||
" <td>中建二局第三建筑工程有限公司</td>\n",
|
||
" <td>133.63</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300004</th>\n",
|
||
" <td>陈军</td>\n",
|
||
" <td>1976-07</td>\n",
|
||
" <td>中建二局第三建筑工程有限公司</td>\n",
|
||
" <td>133.29</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300005</th>\n",
|
||
" <td>樊海瑞</td>\n",
|
||
" <td>1981-06</td>\n",
|
||
" <td>中国民生银行股份有限公司</td>\n",
|
||
" <td>132.46</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 姓名 出生年月 单位名称 积分分值\n",
|
||
"公示编号 \n",
|
||
"202300001 张浩 1977-02 北京首钢股份有限公司 140.05\n",
|
||
"202300002 冯云 1982-02 中国人民解放军空军二十三厂 134.29\n",
|
||
"202300003 王天东 1975-01 中建二局第三建筑工程有限公司 133.63\n",
|
||
"202300004 陈军 1976-07 中建二局第三建筑工程有限公司 133.29\n",
|
||
"202300005 樊海瑞 1981-06 中国民生银行股份有限公司 132.46"
|
||
]
|
||
},
|
||
"execution_count": 106,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"settle_df = pd.read_csv('res/2023年北京积分落户数据.csv', index_col='公示编号')\n",
|
||
"settle_df.head(5)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 107,
|
||
"id": "734eb268-3ad7-4e67-9661-08328075992b",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"<class 'pandas.core.frame.DataFrame'>\n",
|
||
"Index: 6003 entries, 202300001 to 202306003\n",
|
||
"Data columns (total 4 columns):\n",
|
||
" # Column Non-Null Count Dtype \n",
|
||
"--- ------ -------------- ----- \n",
|
||
" 0 姓名 6003 non-null object \n",
|
||
" 1 出生年月 6003 non-null object \n",
|
||
" 2 单位名称 6003 non-null object \n",
|
||
" 3 积分分值 6003 non-null float64\n",
|
||
"dtypes: float64(1), object(3)\n",
|
||
"memory usage: 234.5+ KB\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"settle_df.info()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 108,
|
||
"id": "63698465-ddcd-430c-bd96-e78abaaebda3",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"<class 'pandas.core.frame.DataFrame'>\n",
|
||
"Index: 6003 entries, 202300001 to 202306003\n",
|
||
"Data columns (total 4 columns):\n",
|
||
" # Column Non-Null Count Dtype \n",
|
||
"--- ------ -------------- ----- \n",
|
||
" 0 姓名 6003 non-null object \n",
|
||
" 1 出生年月 6003 non-null datetime64[ns]\n",
|
||
" 2 单位名称 6003 non-null object \n",
|
||
" 3 积分分值 6003 non-null float64 \n",
|
||
"dtypes: datetime64[ns](1), float64(1), object(2)\n",
|
||
"memory usage: 234.5+ KB\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"# 将字符串处理成日期\n",
|
||
"settle_df['出生年月'] = pd.to_datetime(settle_df['出生年月'])\n",
|
||
"settle_df.info()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 109,
|
||
"id": "989c56c7-85fa-4180-9b86-5247a41cdbab",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>姓名</th>\n",
|
||
" <th>出生年月</th>\n",
|
||
" <th>单位名称</th>\n",
|
||
" <th>积分分值</th>\n",
|
||
" <th>年龄</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>公示编号</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>202300001</th>\n",
|
||
" <td>张浩</td>\n",
|
||
" <td>1977-02-01</td>\n",
|
||
" <td>北京首钢股份有限公司</td>\n",
|
||
" <td>140.05</td>\n",
|
||
" <td>45</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300002</th>\n",
|
||
" <td>冯云</td>\n",
|
||
" <td>1982-02-01</td>\n",
|
||
" <td>中国人民解放军空军二十三厂</td>\n",
|
||
" <td>134.29</td>\n",
|
||
" <td>40</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300003</th>\n",
|
||
" <td>王天东</td>\n",
|
||
" <td>1975-01-01</td>\n",
|
||
" <td>中建二局第三建筑工程有限公司</td>\n",
|
||
" <td>133.63</td>\n",
|
||
" <td>48</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300004</th>\n",
|
||
" <td>陈军</td>\n",
|
||
" <td>1976-07-01</td>\n",
|
||
" <td>中建二局第三建筑工程有限公司</td>\n",
|
||
" <td>133.29</td>\n",
|
||
" <td>46</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300005</th>\n",
|
||
" <td>樊海瑞</td>\n",
|
||
" <td>1981-06-01</td>\n",
|
||
" <td>中国民生银行股份有限公司</td>\n",
|
||
" <td>132.46</td>\n",
|
||
" <td>41</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 姓名 出生年月 单位名称 积分分值 年龄\n",
|
||
"公示编号 \n",
|
||
"202300001 张浩 1977-02-01 北京首钢股份有限公司 140.05 45\n",
|
||
"202300002 冯云 1982-02-01 中国人民解放军空军二十三厂 134.29 40\n",
|
||
"202300003 王天东 1975-01-01 中建二局第三建筑工程有限公司 133.63 48\n",
|
||
"202300004 陈军 1976-07-01 中建二局第三建筑工程有限公司 133.29 46\n",
|
||
"202300005 樊海瑞 1981-06-01 中国民生银行股份有限公司 132.46 41"
|
||
]
|
||
},
|
||
"execution_count": 109,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# 将生日换算成年龄\n",
|
||
"settle_df['年龄'] = (pd.to_datetime('2023-01-01') - settle_df.出生年月).dt.days // 365\n",
|
||
"settle_df.head(5)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 110,
|
||
"id": "4191c7a2-19fd-4347-ac79-2371c8e59c10",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>姓名</th>\n",
|
||
" <th>出生年月</th>\n",
|
||
" <th>单位名称</th>\n",
|
||
" <th>积分分值</th>\n",
|
||
" <th>年龄</th>\n",
|
||
" <th>年龄段</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>公示编号</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>202300001</th>\n",
|
||
" <td>张浩</td>\n",
|
||
" <td>1977-02-01</td>\n",
|
||
" <td>北京首钢股份有限公司</td>\n",
|
||
" <td>140.05</td>\n",
|
||
" <td>45</td>\n",
|
||
" <td>45~49岁</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300002</th>\n",
|
||
" <td>冯云</td>\n",
|
||
" <td>1982-02-01</td>\n",
|
||
" <td>中国人民解放军空军二十三厂</td>\n",
|
||
" <td>134.29</td>\n",
|
||
" <td>40</td>\n",
|
||
" <td>40~44岁</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300003</th>\n",
|
||
" <td>王天东</td>\n",
|
||
" <td>1975-01-01</td>\n",
|
||
" <td>中建二局第三建筑工程有限公司</td>\n",
|
||
" <td>133.63</td>\n",
|
||
" <td>48</td>\n",
|
||
" <td>45~49岁</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300004</th>\n",
|
||
" <td>陈军</td>\n",
|
||
" <td>1976-07-01</td>\n",
|
||
" <td>中建二局第三建筑工程有限公司</td>\n",
|
||
" <td>133.29</td>\n",
|
||
" <td>46</td>\n",
|
||
" <td>45~49岁</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300005</th>\n",
|
||
" <td>樊海瑞</td>\n",
|
||
" <td>1981-06-01</td>\n",
|
||
" <td>中国民生银行股份有限公司</td>\n",
|
||
" <td>132.46</td>\n",
|
||
" <td>41</td>\n",
|
||
" <td>40~44岁</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 姓名 出生年月 单位名称 积分分值 年龄 年龄段\n",
|
||
"公示编号 \n",
|
||
"202300001 张浩 1977-02-01 北京首钢股份有限公司 140.05 45 45~49岁\n",
|
||
"202300002 冯云 1982-02-01 中国人民解放军空军二十三厂 134.29 40 40~44岁\n",
|
||
"202300003 王天东 1975-01-01 中建二局第三建筑工程有限公司 133.63 48 45~49岁\n",
|
||
"202300004 陈军 1976-07-01 中建二局第三建筑工程有限公司 133.29 46 45~49岁\n",
|
||
"202300005 樊海瑞 1981-06-01 中国民生银行股份有限公司 132.46 41 40~44岁"
|
||
]
|
||
},
|
||
"execution_count": 110,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# 将年龄划分到年龄段 - 分箱 - 数据桶\n",
|
||
"settle_df['年龄段'] = pd.cut(\n",
|
||
" settle_df.年龄,\n",
|
||
" bins=np.arange(35, 61, 5),\n",
|
||
" labels=['35~39岁', '40~44岁', '45~49岁', '50~54岁', '55~59岁'],\n",
|
||
" right=False\n",
|
||
")\n",
|
||
"settle_df.head(5)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 111,
|
||
"id": "ea2e0c9b-0aa0-41d3-a52a-6926b797465c",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"年龄段\n",
|
||
"40~44岁 4215\n",
|
||
"45~49岁 1053\n",
|
||
"35~39岁 681\n",
|
||
"50~54岁 34\n",
|
||
"55~59岁 20\n",
|
||
"Name: count, dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 111,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# 统计每个元素出现的频次\n",
|
||
"temp = settle_df.年龄段.value_counts()\n",
|
||
"temp"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 112,
|
||
"id": "30843274-b940-4527-92ed-97db86bb4ec7",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([[0. , 0.39277201, 0.15816993, 1. ],\n",
|
||
" [0.18246828, 0.59332564, 0.30675894, 1. ],\n",
|
||
" [0.45176471, 0.76708958, 0.46120723, 1. ],\n",
|
||
" [0.72312188, 0.88961169, 0.69717801, 1. ],\n",
|
||
" [0.91326413, 0.96670511, 0.89619377, 1. ]])"
|
||
]
|
||
},
|
||
"execution_count": 112,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"plt.cm.Greens(np.linspace(0.9, 0.1, 5))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 113,
|
||
"id": "375dd407-9d0a-4788-a38e-3a37efbb6d3b",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/svg+xml": [
|
||
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
|
||
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
|
||
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
|
||
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"499.925pt\" height=\"252.574062pt\" viewBox=\"0 0 499.925 252.574062\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
|
||
" <metadata>\n",
|
||
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
|
||
" <cc:Work>\n",
|
||
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
|
||
" <dc:date>2025-03-04T16:47:31.005270</dc:date>\n",
|
||
" <dc:format>image/svg+xml</dc:format>\n",
|
||
" <dc:creator>\n",
|
||
" <cc:Agent>\n",
|
||
" <dc:title>Matplotlib v3.9.4, https://matplotlib.org/</dc:title>\n",
|
||
" </cc:Agent>\n",
|
||
" </dc:creator>\n",
|
||
" </cc:Work>\n",
|
||
" </rdf:RDF>\n",
|
||
" </metadata>\n",
|
||
" <defs>\n",
|
||
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
|
||
" </defs>\n",
|
||
" <g id=\"figure_1\">\n",
|
||
" <g id=\"patch_1\">\n",
|
||
" <path d=\"M 0 252.574062 \n",
|
||
"L 499.925 252.574062 \n",
|
||
"L 499.925 0 \n",
|
||
"L 0 0 \n",
|
||
"z\n",
|
||
"\" style=\"fill: #ffffff\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"axes_1\">\n",
|
||
" <g id=\"patch_2\">\n",
|
||
" <path d=\"M 46.325 228.96 \n",
|
||
"L 492.725 228.96 \n",
|
||
"L 492.725 7.2 \n",
|
||
"L 46.325 7.2 \n",
|
||
"z\n",
|
||
"\" style=\"fill: #ffffff\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_3\">\n",
|
||
" <path d=\"M 68.645 228.96 \n",
|
||
"L 113.285 228.96 \n",
|
||
"L 113.285 17.76 \n",
|
||
"L 68.645 17.76 \n",
|
||
"z\n",
|
||
"\" clip-path=\"url(#p703aa4411f)\" style=\"fill: url(#h3de5bce4dd)\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_4\">\n",
|
||
" <path d=\"M 157.925 228.96 \n",
|
||
"L 202.565 228.96 \n",
|
||
"L 202.565 176.19758 \n",
|
||
"L 157.925 176.19758 \n",
|
||
"z\n",
|
||
"\" clip-path=\"url(#p703aa4411f)\" style=\"fill: url(#h9cf71d2063)\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_5\">\n",
|
||
" <path d=\"M 247.205 228.96 \n",
|
||
"L 291.845 228.96 \n",
|
||
"L 291.845 194.837295 \n",
|
||
"L 247.205 194.837295 \n",
|
||
"z\n",
|
||
"\" clip-path=\"url(#p703aa4411f)\" style=\"fill: url(#h43492deada)\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_6\">\n",
|
||
" <path d=\"M 336.485 228.96 \n",
|
||
"L 381.125 228.96 \n",
|
||
"L 381.125 227.25637 \n",
|
||
"L 336.485 227.25637 \n",
|
||
"z\n",
|
||
"\" clip-path=\"url(#p703aa4411f)\" style=\"fill: url(#h6d0bdebefd)\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_7\">\n",
|
||
" <path d=\"M 425.765 228.96 \n",
|
||
"L 470.405 228.96 \n",
|
||
"L 470.405 227.957865 \n",
|
||
"L 425.765 227.957865 \n",
|
||
"z\n",
|
||
"\" clip-path=\"url(#p703aa4411f)\" style=\"fill: url(#h6b47a2d288)\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"matplotlib.axis_1\">\n",
|
||
" <g id=\"xtick_1\">\n",
|
||
" <g id=\"line2d_1\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"m2b3d741f5c\" d=\"M 0 0 \n",
|
||
"L 0 3.5 \n",
|
||
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </defs>\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m2b3d741f5c\" x=\"90.965\" y=\"228.96\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_1\">\n",
|
||
" <!-- 40~44岁 -->\n",
|
||
" <g transform=\"translate(73.465 244.085) scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"SimHei-34\" d=\"M 2975 1200 \n",
|
||
"L 2450 1200 \n",
|
||
"L 2450 100 \n",
|
||
"L 1875 100 \n",
|
||
"L 1875 1200 \n",
|
||
"L 200 1200 \n",
|
||
"L 200 1675 \n",
|
||
"L 1875 4425 \n",
|
||
"L 2450 4425 \n",
|
||
"L 2450 1675 \n",
|
||
"L 2975 1675 \n",
|
||
"L 2975 1200 \n",
|
||
"z\n",
|
||
"M 1875 1675 \n",
|
||
"L 1875 3525 \n",
|
||
"L 750 1675 \n",
|
||
"L 1875 1675 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"SimHei-30\" d=\"M 2975 2250 \n",
|
||
"Q 2975 1350 2650 700 \n",
|
||
"Q 2325 50 1600 50 \n",
|
||
"Q 875 50 537 700 \n",
|
||
"Q 200 1350 200 2250 \n",
|
||
"Q 200 3150 537 3787 \n",
|
||
"Q 875 4425 1600 4425 \n",
|
||
"Q 2325 4425 2650 3787 \n",
|
||
"Q 2975 3150 2975 2250 \n",
|
||
"z\n",
|
||
"M 2375 2250 \n",
|
||
"Q 2375 3050 2187 3500 \n",
|
||
"Q 2000 3950 1600 3950 \n",
|
||
"Q 1200 3950 1000 3500 \n",
|
||
"Q 800 3050 800 2250 \n",
|
||
"Q 800 1450 1000 987 \n",
|
||
"Q 1200 525 1600 525 \n",
|
||
"Q 2000 525 2187 987 \n",
|
||
"Q 2375 1450 2375 2250 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"SimHei-7e\" d=\"M 2925 5050 \n",
|
||
"Q 2775 4700 2587 4462 \n",
|
||
"Q 2400 4225 2150 4225 \n",
|
||
"Q 1975 4225 1650 4550 \n",
|
||
"Q 1325 4875 1100 4875 \n",
|
||
"Q 925 4875 825 4737 \n",
|
||
"Q 725 4600 625 4300 \n",
|
||
"L 375 4575 \n",
|
||
"Q 525 4925 687 5162 \n",
|
||
"Q 850 5400 1075 5400 \n",
|
||
"Q 1350 5400 1687 5087 \n",
|
||
"Q 2025 4775 2200 4775 \n",
|
||
"Q 2300 4775 2450 4925 \n",
|
||
"Q 2600 5075 2675 5375 \n",
|
||
"L 2925 5050 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"SimHei-5c81\" d=\"M 2575 3250 \n",
|
||
"Q 2850 3100 3125 3000 \n",
|
||
"Q 2900 2800 2675 2450 \n",
|
||
"L 5400 2450 \n",
|
||
"Q 5175 1775 4925 1287 \n",
|
||
"Q 4675 800 4200 400 \n",
|
||
"Q 3725 0 2962 -225 \n",
|
||
"Q 2200 -450 950 -625 \n",
|
||
"Q 850 -350 625 -125 \n",
|
||
"Q 1775 -25 2350 100 \n",
|
||
"Q 2925 225 3250 375 \n",
|
||
"Q 2925 900 2550 1300 \n",
|
||
"Q 2750 1400 2975 1550 \n",
|
||
"Q 3350 1100 3700 600 \n",
|
||
"Q 4075 850 4337 1212 \n",
|
||
"Q 4600 1575 4750 2025 \n",
|
||
"L 2400 2025 \n",
|
||
"Q 2250 1800 2000 1537 \n",
|
||
"Q 1750 1275 1375 900 \n",
|
||
"Q 1150 1125 925 1225 \n",
|
||
"Q 1300 1475 1800 2037 \n",
|
||
"Q 2300 2600 2575 3250 \n",
|
||
"z\n",
|
||
"M 850 4725 \n",
|
||
"L 1400 4725 \n",
|
||
"Q 1350 4500 1350 3825 \n",
|
||
"L 2975 3825 \n",
|
||
"Q 2975 4775 2950 5125 \n",
|
||
"L 3475 5125 \n",
|
||
"Q 3450 4775 3450 3825 \n",
|
||
"L 5050 3825 \n",
|
||
"Q 5050 4400 5025 4700 \n",
|
||
"L 5550 4700 \n",
|
||
"Q 5525 4375 5525 4100 \n",
|
||
"L 5525 3400 \n",
|
||
"L 875 3400 \n",
|
||
"L 875 4200 \n",
|
||
"Q 875 4425 850 4725 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#SimHei-34\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-7e\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-34\" x=\"150\"/>\n",
|
||
" <use xlink:href=\"#SimHei-34\" x=\"200\"/>\n",
|
||
" <use xlink:href=\"#SimHei-5c81\" x=\"250\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"xtick_2\">\n",
|
||
" <g id=\"line2d_2\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m2b3d741f5c\" x=\"180.245\" y=\"228.96\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_2\">\n",
|
||
" <!-- 45~49岁 -->\n",
|
||
" <g transform=\"translate(162.745 244.085) scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"SimHei-35\" d=\"M 2825 1650 \n",
|
||
"Q 2825 900 2462 475 \n",
|
||
"Q 2100 50 1500 50 \n",
|
||
"Q 975 50 637 400 \n",
|
||
"Q 300 750 275 1350 \n",
|
||
"L 850 1350 \n",
|
||
"Q 850 975 1025 750 \n",
|
||
"Q 1200 525 1525 525 \n",
|
||
"Q 1850 525 2037 800 \n",
|
||
"Q 2225 1075 2225 1650 \n",
|
||
"Q 2225 2150 2062 2387 \n",
|
||
"Q 1900 2625 1625 2625 \n",
|
||
"Q 1400 2625 1237 2525 \n",
|
||
"Q 1075 2425 925 2175 \n",
|
||
"L 425 2175 \n",
|
||
"L 575 4375 \n",
|
||
"L 2725 4375 \n",
|
||
"L 2725 3900 \n",
|
||
"L 1050 3900 \n",
|
||
"L 950 2750 \n",
|
||
"Q 1100 2900 1275 2975 \n",
|
||
"Q 1450 3050 1750 3050 \n",
|
||
"Q 2225 3050 2525 2687 \n",
|
||
"Q 2825 2325 2825 1650 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"SimHei-39\" d=\"M 2825 2300 \n",
|
||
"Q 2825 1275 2462 662 \n",
|
||
"Q 2100 50 1425 50 \n",
|
||
"Q 950 50 662 400 \n",
|
||
"Q 375 750 375 1175 \n",
|
||
"L 950 1175 \n",
|
||
"Q 950 925 1087 725 \n",
|
||
"Q 1225 525 1450 525 \n",
|
||
"Q 1825 525 2012 950 \n",
|
||
"Q 2200 1375 2250 2200 \n",
|
||
"Q 2125 1925 1900 1775 \n",
|
||
"Q 1675 1625 1400 1625 \n",
|
||
"Q 925 1625 625 1975 \n",
|
||
"Q 325 2325 325 2975 \n",
|
||
"Q 325 3625 625 4025 \n",
|
||
"Q 925 4425 1525 4425 \n",
|
||
"Q 2125 4425 2475 3925 \n",
|
||
"Q 2825 3425 2825 2300 \n",
|
||
"z\n",
|
||
"M 2200 2875 \n",
|
||
"Q 2200 3425 2012 3700 \n",
|
||
"Q 1825 3975 1500 3975 \n",
|
||
"Q 1275 3975 1100 3762 \n",
|
||
"Q 925 3550 925 2975 \n",
|
||
"Q 925 2550 1062 2312 \n",
|
||
"Q 1200 2075 1500 2075 \n",
|
||
"Q 1825 2075 2012 2300 \n",
|
||
"Q 2200 2525 2200 2875 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#SimHei-34\"/>\n",
|
||
" <use xlink:href=\"#SimHei-35\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-7e\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-34\" x=\"150\"/>\n",
|
||
" <use xlink:href=\"#SimHei-39\" x=\"200\"/>\n",
|
||
" <use xlink:href=\"#SimHei-5c81\" x=\"250\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"xtick_3\">\n",
|
||
" <g id=\"line2d_3\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m2b3d741f5c\" x=\"269.525\" y=\"228.96\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_3\">\n",
|
||
" <!-- 35~39岁 -->\n",
|
||
" <g transform=\"translate(252.025 244.085) scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"SimHei-33\" d=\"M 2825 1300 \n",
|
||
"Q 2825 725 2462 387 \n",
|
||
"Q 2100 50 1550 50 \n",
|
||
"Q 1000 50 637 387 \n",
|
||
"Q 275 725 275 1425 \n",
|
||
"L 850 1425 \n",
|
||
"Q 850 950 1037 737 \n",
|
||
"Q 1225 525 1550 525 \n",
|
||
"Q 1875 525 2050 725 \n",
|
||
"Q 2225 925 2225 1350 \n",
|
||
"Q 2225 1700 2037 1900 \n",
|
||
"Q 1850 2100 1375 2100 \n",
|
||
"L 1375 2525 \n",
|
||
"Q 1775 2525 1962 2725 \n",
|
||
"Q 2150 2925 2150 3325 \n",
|
||
"Q 2150 3625 2012 3800 \n",
|
||
"Q 1875 3975 1575 3975 \n",
|
||
"Q 1275 3975 1112 3762 \n",
|
||
"Q 950 3550 925 3150 \n",
|
||
"L 375 3150 \n",
|
||
"Q 425 3725 737 4075 \n",
|
||
"Q 1050 4425 1575 4425 \n",
|
||
"Q 2125 4425 2425 4112 \n",
|
||
"Q 2725 3800 2725 3350 \n",
|
||
"Q 2725 2925 2575 2687 \n",
|
||
"Q 2425 2450 2075 2325 \n",
|
||
"Q 2425 2250 2625 1975 \n",
|
||
"Q 2825 1700 2825 1300 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#SimHei-33\"/>\n",
|
||
" <use xlink:href=\"#SimHei-35\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-7e\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-33\" x=\"150\"/>\n",
|
||
" <use xlink:href=\"#SimHei-39\" x=\"200\"/>\n",
|
||
" <use xlink:href=\"#SimHei-5c81\" x=\"250\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"xtick_4\">\n",
|
||
" <g id=\"line2d_4\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m2b3d741f5c\" x=\"358.805\" y=\"228.96\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_4\">\n",
|
||
" <!-- 50~54岁 -->\n",
|
||
" <g transform=\"translate(341.305 244.085) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-35\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-7e\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-35\" x=\"150\"/>\n",
|
||
" <use xlink:href=\"#SimHei-34\" x=\"200\"/>\n",
|
||
" <use xlink:href=\"#SimHei-5c81\" x=\"250\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"xtick_5\">\n",
|
||
" <g id=\"line2d_5\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#m2b3d741f5c\" x=\"448.085\" y=\"228.96\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_5\">\n",
|
||
" <!-- 55~59岁 -->\n",
|
||
" <g transform=\"translate(430.585 244.085) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-35\"/>\n",
|
||
" <use xlink:href=\"#SimHei-35\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-7e\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-35\" x=\"150\"/>\n",
|
||
" <use xlink:href=\"#SimHei-39\" x=\"200\"/>\n",
|
||
" <use xlink:href=\"#SimHei-5c81\" x=\"250\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"matplotlib.axis_2\">\n",
|
||
" <g id=\"ytick_1\">\n",
|
||
" <g id=\"line2d_6\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"mb6b8f0854d\" d=\"M 0 0 \n",
|
||
"L -3.5 0 \n",
|
||
"\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </defs>\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mb6b8f0854d\" x=\"46.325\" y=\"228.96\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_6\">\n",
|
||
" <!-- 0 -->\n",
|
||
" <g transform=\"translate(34.325 232.377969) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-30\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_2\">\n",
|
||
" <g id=\"line2d_7\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mb6b8f0854d\" x=\"46.325\" y=\"203.906619\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_7\">\n",
|
||
" <!-- 500 -->\n",
|
||
" <g transform=\"translate(24.325 207.324588) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-35\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"100\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_3\">\n",
|
||
" <g id=\"line2d_8\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mb6b8f0854d\" x=\"46.325\" y=\"178.853238\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_8\">\n",
|
||
" <!-- 1000 -->\n",
|
||
" <g transform=\"translate(19.325 182.271207) scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"SimHei-31\" d=\"M 1950 100 \n",
|
||
"L 1375 100 \n",
|
||
"L 1375 3425 \n",
|
||
"L 625 3425 \n",
|
||
"L 625 3725 \n",
|
||
"Q 1075 3725 1325 3900 \n",
|
||
"Q 1575 4075 1650 4425 \n",
|
||
"L 1950 4425 \n",
|
||
"L 1950 100 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#SimHei-31\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"150\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_4\">\n",
|
||
" <g id=\"line2d_9\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mb6b8f0854d\" x=\"46.325\" y=\"153.799858\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_9\">\n",
|
||
" <!-- 1500 -->\n",
|
||
" <g transform=\"translate(19.325 157.217826) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-31\"/>\n",
|
||
" <use xlink:href=\"#SimHei-35\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"150\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_5\">\n",
|
||
" <g id=\"line2d_10\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mb6b8f0854d\" x=\"46.325\" y=\"128.746477\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_10\">\n",
|
||
" <!-- 2000 -->\n",
|
||
" <g transform=\"translate(19.325 132.164446) scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"SimHei-32\" d=\"M 2850 100 \n",
|
||
"L 300 100 \n",
|
||
"L 300 500 \n",
|
||
"Q 450 900 712 1237 \n",
|
||
"Q 975 1575 1475 2000 \n",
|
||
"Q 1850 2325 2012 2600 \n",
|
||
"Q 2175 2875 2175 3200 \n",
|
||
"Q 2175 3525 2037 3737 \n",
|
||
"Q 1900 3950 1600 3950 \n",
|
||
"Q 1350 3950 1162 3725 \n",
|
||
"Q 975 3500 975 2925 \n",
|
||
"L 400 2925 \n",
|
||
"Q 425 3650 737 4037 \n",
|
||
"Q 1050 4425 1625 4425 \n",
|
||
"Q 2175 4425 2475 4087 \n",
|
||
"Q 2775 3750 2775 3175 \n",
|
||
"Q 2775 2700 2500 2350 \n",
|
||
"Q 2225 2000 1825 1650 \n",
|
||
"Q 1375 1250 1200 1050 \n",
|
||
"Q 1025 850 875 575 \n",
|
||
"L 2850 575 \n",
|
||
"L 2850 100 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#SimHei-32\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"150\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_6\">\n",
|
||
" <g id=\"line2d_11\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mb6b8f0854d\" x=\"46.325\" y=\"103.693096\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_11\">\n",
|
||
" <!-- 2500 -->\n",
|
||
" <g transform=\"translate(19.325 107.111065) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-32\"/>\n",
|
||
" <use xlink:href=\"#SimHei-35\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"150\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_7\">\n",
|
||
" <g id=\"line2d_12\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mb6b8f0854d\" x=\"46.325\" y=\"78.639715\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_12\">\n",
|
||
" <!-- 3000 -->\n",
|
||
" <g transform=\"translate(19.325 82.057684) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-33\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"150\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_8\">\n",
|
||
" <g id=\"line2d_13\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mb6b8f0854d\" x=\"46.325\" y=\"53.586335\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_13\">\n",
|
||
" <!-- 3500 -->\n",
|
||
" <g transform=\"translate(19.325 57.004303) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-33\"/>\n",
|
||
" <use xlink:href=\"#SimHei-35\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"150\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"ytick_9\">\n",
|
||
" <g id=\"line2d_14\">\n",
|
||
" <g>\n",
|
||
" <use xlink:href=\"#mb6b8f0854d\" x=\"46.325\" y=\"28.532954\" style=\"stroke: #000000; stroke-width: 0.8\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_14\">\n",
|
||
" <!-- 4000 -->\n",
|
||
" <g transform=\"translate(19.325 31.950922) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-34\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"150\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_15\">\n",
|
||
" <!-- Count -->\n",
|
||
" <g transform=\"translate(14.035938 130.58) rotate(-90) scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"SimHei-43\" d=\"M 3000 1800 \n",
|
||
"Q 2975 850 2600 450 \n",
|
||
"Q 2225 50 1700 50 \n",
|
||
"Q 1100 50 675 537 \n",
|
||
"Q 250 1025 250 2100 \n",
|
||
"Q 250 3275 662 3850 \n",
|
||
"Q 1075 4425 1725 4425 \n",
|
||
"Q 2275 4425 2637 4012 \n",
|
||
"Q 3000 3600 2975 2850 \n",
|
||
"L 2425 2850 \n",
|
||
"Q 2425 3400 2250 3675 \n",
|
||
"Q 2075 3950 1725 3950 \n",
|
||
"Q 1325 3950 1087 3537 \n",
|
||
"Q 850 3125 850 2150 \n",
|
||
"Q 850 1250 1087 887 \n",
|
||
"Q 1325 525 1700 525 \n",
|
||
"Q 1975 525 2200 787 \n",
|
||
"Q 2425 1050 2425 1800 \n",
|
||
"L 3000 1800 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"SimHei-6f\" d=\"M 2950 1500 \n",
|
||
"Q 2950 850 2550 450 \n",
|
||
"Q 2150 50 1600 50 \n",
|
||
"Q 1050 50 650 450 \n",
|
||
"Q 250 850 250 1500 \n",
|
||
"Q 250 2150 650 2550 \n",
|
||
"Q 1050 2950 1600 2950 \n",
|
||
"Q 2150 2950 2550 2550 \n",
|
||
"Q 2950 2150 2950 1500 \n",
|
||
"z\n",
|
||
"M 2400 1500 \n",
|
||
"Q 2400 2000 2150 2250 \n",
|
||
"Q 1900 2500 1600 2500 \n",
|
||
"Q 1300 2500 1050 2250 \n",
|
||
"Q 800 2000 800 1500 \n",
|
||
"Q 800 1000 1050 750 \n",
|
||
"Q 1300 500 1600 500 \n",
|
||
"Q 1900 500 2150 750 \n",
|
||
"Q 2400 1000 2400 1500 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"SimHei-75\" d=\"M 2825 100 \n",
|
||
"L 2325 100 \n",
|
||
"L 2325 625 \n",
|
||
"Q 2125 350 1887 200 \n",
|
||
"Q 1650 50 1275 50 \n",
|
||
"Q 825 50 600 300 \n",
|
||
"Q 375 550 375 925 \n",
|
||
"L 375 2900 \n",
|
||
"L 875 2900 \n",
|
||
"L 875 1100 \n",
|
||
"Q 875 800 1025 625 \n",
|
||
"Q 1175 450 1425 450 \n",
|
||
"Q 1750 450 2037 787 \n",
|
||
"Q 2325 1125 2325 1625 \n",
|
||
"L 2325 2900 \n",
|
||
"L 2825 2900 \n",
|
||
"L 2825 100 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"SimHei-6e\" d=\"M 2825 100 \n",
|
||
"L 2325 100 \n",
|
||
"L 2325 1900 \n",
|
||
"Q 2325 2200 2175 2375 \n",
|
||
"Q 2025 2550 1775 2550 \n",
|
||
"Q 1450 2550 1162 2212 \n",
|
||
"Q 875 1875 875 1375 \n",
|
||
"L 875 100 \n",
|
||
"L 375 100 \n",
|
||
"L 375 2900 \n",
|
||
"L 875 2900 \n",
|
||
"L 875 2375 \n",
|
||
"Q 1075 2650 1312 2800 \n",
|
||
"Q 1550 2950 1925 2950 \n",
|
||
"Q 2375 2950 2600 2700 \n",
|
||
"Q 2825 2450 2825 2075 \n",
|
||
"L 2825 100 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"SimHei-74\" d=\"M 2775 175 \n",
|
||
"Q 2650 125 2487 87 \n",
|
||
"Q 2325 50 2050 50 \n",
|
||
"Q 1600 50 1325 300 \n",
|
||
"Q 1050 550 1050 1000 \n",
|
||
"L 1050 2500 \n",
|
||
"L 200 2500 \n",
|
||
"L 200 2900 \n",
|
||
"L 1050 2900 \n",
|
||
"L 1050 3875 \n",
|
||
"L 1550 3875 \n",
|
||
"L 1550 2900 \n",
|
||
"L 2575 2900 \n",
|
||
"L 2575 2500 \n",
|
||
"L 1550 2500 \n",
|
||
"L 1550 975 \n",
|
||
"Q 1550 775 1650 637 \n",
|
||
"Q 1750 500 2025 500 \n",
|
||
"Q 2300 500 2475 550 \n",
|
||
"Q 2650 600 2775 675 \n",
|
||
"L 2775 175 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#SimHei-43\"/>\n",
|
||
" <use xlink:href=\"#SimHei-6f\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-75\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-6e\" x=\"150\"/>\n",
|
||
" <use xlink:href=\"#SimHei-74\" x=\"200\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_8\">\n",
|
||
" <path d=\"M 46.325 228.96 \n",
|
||
"L 46.325 7.2 \n",
|
||
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_9\">\n",
|
||
" <path d=\"M 492.725 228.96 \n",
|
||
"L 492.725 7.2 \n",
|
||
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_10\">\n",
|
||
" <path d=\"M 46.325 228.96 \n",
|
||
"L 492.725 228.96 \n",
|
||
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_11\">\n",
|
||
" <path d=\"M 46.325 7.2 \n",
|
||
"L 492.725 7.2 \n",
|
||
"\" style=\"fill: none; stroke: #000000; stroke-width: 0.8; stroke-linejoin: miter; stroke-linecap: square\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_16\">\n",
|
||
" <!-- 4215 -->\n",
|
||
" <g transform=\"translate(80.965 16.256797) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-34\"/>\n",
|
||
" <use xlink:href=\"#SimHei-32\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-31\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-35\" x=\"150\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_17\">\n",
|
||
" <!-- 1053 -->\n",
|
||
" <g transform=\"translate(170.245 174.694377) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-31\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-35\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-33\" x=\"150\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_18\">\n",
|
||
" <!-- 681 -->\n",
|
||
" <g transform=\"translate(262.025 193.334093) scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"SimHei-36\" d=\"M 2850 1550 \n",
|
||
"Q 2850 850 2550 450 \n",
|
||
"Q 2250 50 1650 50 \n",
|
||
"Q 1050 50 700 550 \n",
|
||
"Q 350 1050 350 2175 \n",
|
||
"Q 350 3200 712 3812 \n",
|
||
"Q 1075 4425 1750 4425 \n",
|
||
"Q 2225 4425 2512 4075 \n",
|
||
"Q 2800 3725 2800 3300 \n",
|
||
"L 2225 3300 \n",
|
||
"Q 2225 3550 2087 3750 \n",
|
||
"Q 1950 3950 1725 3950 \n",
|
||
"Q 1350 3950 1150 3562 \n",
|
||
"Q 950 3175 925 2375 \n",
|
||
"Q 1100 2700 1300 2825 \n",
|
||
"Q 1500 2950 1775 2950 \n",
|
||
"Q 2250 2950 2550 2575 \n",
|
||
"Q 2850 2200 2850 1550 \n",
|
||
"z\n",
|
||
"M 2250 1550 \n",
|
||
"Q 2250 2000 2100 2250 \n",
|
||
"Q 1950 2500 1675 2500 \n",
|
||
"Q 1350 2500 1162 2250 \n",
|
||
"Q 975 2000 975 1650 \n",
|
||
"Q 975 1100 1162 800 \n",
|
||
"Q 1350 500 1675 500 \n",
|
||
"Q 1900 500 2075 725 \n",
|
||
"Q 2250 950 2250 1550 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"SimHei-38\" d=\"M 2875 1325 \n",
|
||
"Q 2875 700 2525 375 \n",
|
||
"Q 2175 50 1575 50 \n",
|
||
"Q 975 50 625 375 \n",
|
||
"Q 275 700 275 1325 \n",
|
||
"Q 275 1650 475 1912 \n",
|
||
"Q 675 2175 1025 2300 \n",
|
||
"Q 725 2425 562 2650 \n",
|
||
"Q 400 2875 400 3225 \n",
|
||
"Q 400 3775 750 4100 \n",
|
||
"Q 1100 4425 1575 4425 \n",
|
||
"Q 2050 4425 2400 4100 \n",
|
||
"Q 2750 3775 2750 3225 \n",
|
||
"Q 2750 2875 2587 2650 \n",
|
||
"Q 2425 2425 2125 2300 \n",
|
||
"Q 2475 2175 2675 1912 \n",
|
||
"Q 2875 1650 2875 1325 \n",
|
||
"z\n",
|
||
"M 2200 3225 \n",
|
||
"Q 2200 3625 2025 3800 \n",
|
||
"Q 1850 3975 1575 3975 \n",
|
||
"Q 1300 3975 1125 3800 \n",
|
||
"Q 950 3625 950 3225 \n",
|
||
"Q 950 2825 1137 2662 \n",
|
||
"Q 1325 2500 1575 2500 \n",
|
||
"Q 1825 2500 2012 2662 \n",
|
||
"Q 2200 2825 2200 3225 \n",
|
||
"z\n",
|
||
"M 2300 1325 \n",
|
||
"Q 2300 1675 2112 1875 \n",
|
||
"Q 1925 2075 1575 2075 \n",
|
||
"Q 1225 2075 1037 1875 \n",
|
||
"Q 850 1675 850 1325 \n",
|
||
"Q 850 925 1050 712 \n",
|
||
"Q 1250 500 1575 500 \n",
|
||
"Q 1900 500 2100 712 \n",
|
||
"Q 2300 925 2300 1325 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#SimHei-36\"/>\n",
|
||
" <use xlink:href=\"#SimHei-38\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-31\" x=\"100\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_19\">\n",
|
||
" <!-- 34 -->\n",
|
||
" <g transform=\"translate(353.805 225.753167) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-33\"/>\n",
|
||
" <use xlink:href=\"#SimHei-34\" x=\"50\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_20\">\n",
|
||
" <!-- 20 -->\n",
|
||
" <g transform=\"translate(443.085 226.454662) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-32\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"50\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <defs>\n",
|
||
" <clipPath id=\"p703aa4411f\">\n",
|
||
" <rect x=\"46.325\" y=\"7.2\" width=\"446.4\" height=\"221.76\"/>\n",
|
||
" </clipPath>\n",
|
||
" </defs>\n",
|
||
" <defs>\n",
|
||
" <pattern id=\"h3de5bce4dd\" patternUnits=\"userSpaceOnUse\" x=\"0\" y=\"0\" width=\"72\" height=\"72\">\n",
|
||
" <rect x=\"0\" y=\"0\" width=\"73\" height=\"73\" fill=\"#006428\"/>\n",
|
||
" <path d=\"M -36 36 \n",
|
||
"L 36 -36 \n",
|
||
"M -30 42 \n",
|
||
"L 42 -30 \n",
|
||
"M -24 48 \n",
|
||
"L 48 -24 \n",
|
||
"M -18 54 \n",
|
||
"L 54 -18 \n",
|
||
"M -12 60 \n",
|
||
"L 60 -12 \n",
|
||
"M -6 66 \n",
|
||
"L 66 -6 \n",
|
||
"M 0 72 \n",
|
||
"L 72 0 \n",
|
||
"M 6 78 \n",
|
||
"L 78 6 \n",
|
||
"M 12 84 \n",
|
||
"L 84 12 \n",
|
||
"M 18 90 \n",
|
||
"L 90 18 \n",
|
||
"M 24 96 \n",
|
||
"L 96 24 \n",
|
||
"M 30 102 \n",
|
||
"L 102 30 \n",
|
||
"M 36 108 \n",
|
||
"L 108 36 \n",
|
||
"\" style=\"fill: #000000; stroke: #000000; stroke-width: 1.0; stroke-linecap: butt; stroke-linejoin: miter\"/>\n",
|
||
" </pattern>\n",
|
||
" <pattern id=\"h9cf71d2063\" patternUnits=\"userSpaceOnUse\" x=\"0\" y=\"0\" width=\"72\" height=\"72\">\n",
|
||
" <rect x=\"0\" y=\"0\" width=\"73\" height=\"73\" fill=\"#228a44\"/>\n",
|
||
" <path d=\"M -36 36 \n",
|
||
"L 36 -36 \n",
|
||
"M -30 42 \n",
|
||
"L 42 -30 \n",
|
||
"M -24 48 \n",
|
||
"L 48 -24 \n",
|
||
"M -18 54 \n",
|
||
"L 54 -18 \n",
|
||
"M -12 60 \n",
|
||
"L 60 -12 \n",
|
||
"M -6 66 \n",
|
||
"L 66 -6 \n",
|
||
"M 0 72 \n",
|
||
"L 72 0 \n",
|
||
"M 6 78 \n",
|
||
"L 78 6 \n",
|
||
"M 12 84 \n",
|
||
"L 84 12 \n",
|
||
"M 18 90 \n",
|
||
"L 90 18 \n",
|
||
"M 24 96 \n",
|
||
"L 96 24 \n",
|
||
"M 30 102 \n",
|
||
"L 102 30 \n",
|
||
"M 36 108 \n",
|
||
"L 108 36 \n",
|
||
"\" style=\"fill: #000000; stroke: #000000; stroke-width: 1.0; stroke-linecap: butt; stroke-linejoin: miter\"/>\n",
|
||
" </pattern>\n",
|
||
" <pattern id=\"h43492deada\" patternUnits=\"userSpaceOnUse\" x=\"0\" y=\"0\" width=\"72\" height=\"72\">\n",
|
||
" <rect x=\"0\" y=\"0\" width=\"73\" height=\"73\" fill=\"#4bb062\"/>\n",
|
||
" <path d=\"M -36 36 \n",
|
||
"L 36 -36 \n",
|
||
"M -30 42 \n",
|
||
"L 42 -30 \n",
|
||
"M -24 48 \n",
|
||
"L 48 -24 \n",
|
||
"M -18 54 \n",
|
||
"L 54 -18 \n",
|
||
"M -12 60 \n",
|
||
"L 60 -12 \n",
|
||
"M -6 66 \n",
|
||
"L 66 -6 \n",
|
||
"M 0 72 \n",
|
||
"L 72 0 \n",
|
||
"M 6 78 \n",
|
||
"L 78 6 \n",
|
||
"M 12 84 \n",
|
||
"L 84 12 \n",
|
||
"M 18 90 \n",
|
||
"L 90 18 \n",
|
||
"M 24 96 \n",
|
||
"L 96 24 \n",
|
||
"M 30 102 \n",
|
||
"L 102 30 \n",
|
||
"M 36 108 \n",
|
||
"L 108 36 \n",
|
||
"\" style=\"fill: #000000; stroke: #000000; stroke-width: 1.0; stroke-linecap: butt; stroke-linejoin: miter\"/>\n",
|
||
" </pattern>\n",
|
||
" <pattern id=\"h6d0bdebefd\" patternUnits=\"userSpaceOnUse\" x=\"0\" y=\"0\" width=\"72\" height=\"72\">\n",
|
||
" <rect x=\"0\" y=\"0\" width=\"73\" height=\"73\" fill=\"#86cc85\"/>\n",
|
||
" <path d=\"M -36 36 \n",
|
||
"L 36 -36 \n",
|
||
"M -30 42 \n",
|
||
"L 42 -30 \n",
|
||
"M -24 48 \n",
|
||
"L 48 -24 \n",
|
||
"M -18 54 \n",
|
||
"L 54 -18 \n",
|
||
"M -12 60 \n",
|
||
"L 60 -12 \n",
|
||
"M -6 66 \n",
|
||
"L 66 -6 \n",
|
||
"M 0 72 \n",
|
||
"L 72 0 \n",
|
||
"M 6 78 \n",
|
||
"L 78 6 \n",
|
||
"M 12 84 \n",
|
||
"L 84 12 \n",
|
||
"M 18 90 \n",
|
||
"L 90 18 \n",
|
||
"M 24 96 \n",
|
||
"L 96 24 \n",
|
||
"M 30 102 \n",
|
||
"L 102 30 \n",
|
||
"M 36 108 \n",
|
||
"L 108 36 \n",
|
||
"\" style=\"fill: #000000; stroke: #000000; stroke-width: 1.0; stroke-linecap: butt; stroke-linejoin: miter\"/>\n",
|
||
" </pattern>\n",
|
||
" <pattern id=\"h6b47a2d288\" patternUnits=\"userSpaceOnUse\" x=\"0\" y=\"0\" width=\"72\" height=\"72\">\n",
|
||
" <rect x=\"0\" y=\"0\" width=\"73\" height=\"73\" fill=\"#b8e3b2\"/>\n",
|
||
" <path d=\"M -36 36 \n",
|
||
"L 36 -36 \n",
|
||
"M -30 42 \n",
|
||
"L 42 -30 \n",
|
||
"M -24 48 \n",
|
||
"L 48 -24 \n",
|
||
"M -18 54 \n",
|
||
"L 54 -18 \n",
|
||
"M -12 60 \n",
|
||
"L 60 -12 \n",
|
||
"M -6 66 \n",
|
||
"L 66 -6 \n",
|
||
"M 0 72 \n",
|
||
"L 72 0 \n",
|
||
"M 6 78 \n",
|
||
"L 78 6 \n",
|
||
"M 12 84 \n",
|
||
"L 84 12 \n",
|
||
"M 18 90 \n",
|
||
"L 90 18 \n",
|
||
"M 24 96 \n",
|
||
"L 96 24 \n",
|
||
"M 30 102 \n",
|
||
"L 102 30 \n",
|
||
"M 36 108 \n",
|
||
"L 108 36 \n",
|
||
"\" style=\"fill: #000000; stroke: #000000; stroke-width: 1.0; stroke-linecap: butt; stroke-linejoin: miter\"/>\n",
|
||
" </pattern>\n",
|
||
" </defs>\n",
|
||
"</svg>\n"
|
||
],
|
||
"text/plain": [
|
||
"<Figure size 800x400 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"# 绘制柱状图\n",
|
||
"temp.plot(\n",
|
||
" kind='bar', # 图表类型\n",
|
||
" figsize=(8, 4), # 图表尺寸\n",
|
||
" xlabel='', # 横轴标签\n",
|
||
" ylabel='Count', # 纵轴标签\n",
|
||
" width=0.5, # 柱子宽度\n",
|
||
" hatch='//', # 柱子条纹\n",
|
||
" color=plt.cm.Greens(np.linspace(0.9, 0.3, temp.size)) # 颜色值\n",
|
||
")\n",
|
||
"\n",
|
||
"for i in range(temp.size):\n",
|
||
" # plt.text(横坐标, 纵坐标, 标签内容)\n",
|
||
" plt.text(i, temp.iloc[i] + 30, temp.iloc[i], ha='center')\n",
|
||
"\n",
|
||
"# 定制横轴的刻度\n",
|
||
"plt.xticks(rotation=0)\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 114,
|
||
"id": "e020ba6c-d16d-482f-ad3b-a9e855257b91",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/svg+xml": [
|
||
"<?xml version=\"1.0\" encoding=\"utf-8\" standalone=\"no\"?>\n",
|
||
"<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
|
||
" \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
|
||
"<svg xmlns:xlink=\"http://www.w3.org/1999/xlink\" width=\"299.538866pt\" height=\"280.512pt\" viewBox=\"0 0 299.538866 280.512\" xmlns=\"http://www.w3.org/2000/svg\" version=\"1.1\">\n",
|
||
" <metadata>\n",
|
||
" <rdf:RDF xmlns:dc=\"http://purl.org/dc/elements/1.1/\" xmlns:cc=\"http://creativecommons.org/ns#\" xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\">\n",
|
||
" <cc:Work>\n",
|
||
" <dc:type rdf:resource=\"http://purl.org/dc/dcmitype/StillImage\"/>\n",
|
||
" <dc:date>2025-03-04T16:47:31.053333</dc:date>\n",
|
||
" <dc:format>image/svg+xml</dc:format>\n",
|
||
" <dc:creator>\n",
|
||
" <cc:Agent>\n",
|
||
" <dc:title>Matplotlib v3.9.4, https://matplotlib.org/</dc:title>\n",
|
||
" </cc:Agent>\n",
|
||
" </dc:creator>\n",
|
||
" </cc:Work>\n",
|
||
" </rdf:RDF>\n",
|
||
" </metadata>\n",
|
||
" <defs>\n",
|
||
" <style type=\"text/css\">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n",
|
||
" </defs>\n",
|
||
" <g id=\"figure_1\">\n",
|
||
" <g id=\"patch_1\">\n",
|
||
" <path d=\"M 0 280.512 \n",
|
||
"L 299.538866 280.512 \n",
|
||
"L 299.538866 0 \n",
|
||
"L 0 0 \n",
|
||
"z\n",
|
||
"\" style=\"fill: #ffffff\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"axes_1\">\n",
|
||
" <g id=\"patch_2\">\n",
|
||
" <path d=\"M 246.7008 140.256 \n",
|
||
"C 246.7008 120.565973 241.237445 101.256765 230.921114 84.485645 \n",
|
||
"C 220.604784 67.714525 205.834526 54.130223 188.26051 45.250408 \n",
|
||
"C 170.686494 36.370594 150.988505 32.53875 131.36725 34.18298 \n",
|
||
"C 111.745995 35.82721 92.960449 42.883912 77.109379 54.564696 \n",
|
||
"C 61.258308 66.245479 48.954852 82.098515 41.573583 100.352666 \n",
|
||
"C 34.192314 118.606816 32.018749 138.555987 35.295723 157.971408 \n",
|
||
"C 38.572697 177.386829 47.173453 195.517488 60.137094 210.337783 \n",
|
||
"C 73.100735 225.158079 89.92581 236.094743 108.732583 241.925905 \n",
|
||
"L 118.189608 211.424934 \n",
|
||
"C 105.024867 207.34312 93.247314 199.687455 84.172766 189.313248 \n",
|
||
"C 75.098217 178.939041 69.077688 166.24758 66.783806 152.656786 \n",
|
||
"C 64.489924 139.065991 66.01142 125.101571 71.178308 112.323666 \n",
|
||
"C 76.345197 99.545761 84.957616 88.448635 96.053365 80.272087 \n",
|
||
"C 107.149115 72.095539 120.298997 67.155847 134.033875 66.004886 \n",
|
||
"C 147.768754 64.853925 161.557346 67.536216 173.859157 73.752086 \n",
|
||
"C 186.160968 79.967956 196.500148 89.476968 203.72158 101.216752 \n",
|
||
"C 210.943012 112.956535 214.76736 126.472981 214.76736 140.256 \n",
|
||
"z\n",
|
||
"\" style=\"fill: #1f77b4\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_3\">\n",
|
||
" <path d=\"M 108.732583 241.925905 \n",
|
||
"C 127.525804 247.752866 147.571709 248.25653 166.633844 243.380705 \n",
|
||
"C 185.695979 238.50488 203.038062 228.437896 216.725295 214.302894 \n",
|
||
"L 193.784506 192.088825 \n",
|
||
"C 184.203443 201.983327 172.063985 209.030216 158.720491 212.443294 \n",
|
||
"C 145.376997 215.856371 131.344863 215.503806 118.189608 211.424934 \n",
|
||
"z\n",
|
||
"\" style=\"fill: #ff7f0e\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_4\">\n",
|
||
" <path d=\"M 216.725295 214.302894 \n",
|
||
"C 225.545093 205.19456 232.667012 194.583258 237.754729 182.970088 \n",
|
||
"C 242.842446 171.356917 245.814596 158.927606 246.530823 146.269106 \n",
|
||
"L 214.648376 144.465174 \n",
|
||
"C 214.147017 153.326124 212.066512 162.026642 208.50511 170.155861 \n",
|
||
"C 204.943708 178.285081 199.958365 185.712992 193.784506 192.088825 \n",
|
||
"z\n",
|
||
"\" style=\"fill: #2ca02c\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_5\">\n",
|
||
" <path d=\"M 246.530823 146.269106 \n",
|
||
"C 246.566488 145.63877 246.596544 145.008129 246.620988 144.377257 \n",
|
||
"C 246.645432 143.746386 246.664263 143.11531 246.677478 142.484103 \n",
|
||
"L 214.751035 141.815672 \n",
|
||
"C 214.741784 142.257517 214.728602 142.69927 214.711492 143.14088 \n",
|
||
"C 214.694381 143.58249 214.673342 144.023939 214.648376 144.465174 \n",
|
||
"z\n",
|
||
"\" style=\"fill: #d62728\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"patch_6\">\n",
|
||
" <path d=\"M 246.677478 142.484103 \n",
|
||
"C 246.685252 142.112807 246.691082 141.741473 246.694969 141.370115 \n",
|
||
"C 246.698856 140.998758 246.7008 140.627383 246.7008 140.256005 \n",
|
||
"L 214.76736 140.256004 \n",
|
||
"C 214.76736 140.515968 214.765999 140.775931 214.763279 141.035881 \n",
|
||
"C 214.760558 141.295831 214.756476 141.555765 214.751035 141.815672 \n",
|
||
"z\n",
|
||
"\" style=\"fill: #9467bd\"/>\n",
|
||
" </g>\n",
|
||
" <g id=\"matplotlib.axis_1\"/>\n",
|
||
" <g id=\"matplotlib.axis_2\"/>\n",
|
||
" <g id=\"text_1\">\n",
|
||
" <!-- 40~44岁 -->\n",
|
||
" <g transform=\"translate(35.794716 49.413534) scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"SimHei-34\" d=\"M 2975 1200 \n",
|
||
"L 2450 1200 \n",
|
||
"L 2450 100 \n",
|
||
"L 1875 100 \n",
|
||
"L 1875 1200 \n",
|
||
"L 200 1200 \n",
|
||
"L 200 1675 \n",
|
||
"L 1875 4425 \n",
|
||
"L 2450 4425 \n",
|
||
"L 2450 1675 \n",
|
||
"L 2975 1675 \n",
|
||
"L 2975 1200 \n",
|
||
"z\n",
|
||
"M 1875 1675 \n",
|
||
"L 1875 3525 \n",
|
||
"L 750 1675 \n",
|
||
"L 1875 1675 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"SimHei-30\" d=\"M 2975 2250 \n",
|
||
"Q 2975 1350 2650 700 \n",
|
||
"Q 2325 50 1600 50 \n",
|
||
"Q 875 50 537 700 \n",
|
||
"Q 200 1350 200 2250 \n",
|
||
"Q 200 3150 537 3787 \n",
|
||
"Q 875 4425 1600 4425 \n",
|
||
"Q 2325 4425 2650 3787 \n",
|
||
"Q 2975 3150 2975 2250 \n",
|
||
"z\n",
|
||
"M 2375 2250 \n",
|
||
"Q 2375 3050 2187 3500 \n",
|
||
"Q 2000 3950 1600 3950 \n",
|
||
"Q 1200 3950 1000 3500 \n",
|
||
"Q 800 3050 800 2250 \n",
|
||
"Q 800 1450 1000 987 \n",
|
||
"Q 1200 525 1600 525 \n",
|
||
"Q 2000 525 2187 987 \n",
|
||
"Q 2375 1450 2375 2250 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"SimHei-7e\" d=\"M 2925 5050 \n",
|
||
"Q 2775 4700 2587 4462 \n",
|
||
"Q 2400 4225 2150 4225 \n",
|
||
"Q 1975 4225 1650 4550 \n",
|
||
"Q 1325 4875 1100 4875 \n",
|
||
"Q 925 4875 825 4737 \n",
|
||
"Q 725 4600 625 4300 \n",
|
||
"L 375 4575 \n",
|
||
"Q 525 4925 687 5162 \n",
|
||
"Q 850 5400 1075 5400 \n",
|
||
"Q 1350 5400 1687 5087 \n",
|
||
"Q 2025 4775 2200 4775 \n",
|
||
"Q 2300 4775 2450 4925 \n",
|
||
"Q 2600 5075 2675 5375 \n",
|
||
"L 2925 5050 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"SimHei-5c81\" d=\"M 2575 3250 \n",
|
||
"Q 2850 3100 3125 3000 \n",
|
||
"Q 2900 2800 2675 2450 \n",
|
||
"L 5400 2450 \n",
|
||
"Q 5175 1775 4925 1287 \n",
|
||
"Q 4675 800 4200 400 \n",
|
||
"Q 3725 0 2962 -225 \n",
|
||
"Q 2200 -450 950 -625 \n",
|
||
"Q 850 -350 625 -125 \n",
|
||
"Q 1775 -25 2350 100 \n",
|
||
"Q 2925 225 3250 375 \n",
|
||
"Q 2925 900 2550 1300 \n",
|
||
"Q 2750 1400 2975 1550 \n",
|
||
"Q 3350 1100 3700 600 \n",
|
||
"Q 4075 850 4337 1212 \n",
|
||
"Q 4600 1575 4750 2025 \n",
|
||
"L 2400 2025 \n",
|
||
"Q 2250 1800 2000 1537 \n",
|
||
"Q 1750 1275 1375 900 \n",
|
||
"Q 1150 1125 925 1225 \n",
|
||
"Q 1300 1475 1800 2037 \n",
|
||
"Q 2300 2600 2575 3250 \n",
|
||
"z\n",
|
||
"M 850 4725 \n",
|
||
"L 1400 4725 \n",
|
||
"Q 1350 4500 1350 3825 \n",
|
||
"L 2975 3825 \n",
|
||
"Q 2975 4775 2950 5125 \n",
|
||
"L 3475 5125 \n",
|
||
"Q 3450 4775 3450 3825 \n",
|
||
"L 5050 3825 \n",
|
||
"Q 5050 4400 5025 4700 \n",
|
||
"L 5550 4700 \n",
|
||
"Q 5525 4375 5525 4100 \n",
|
||
"L 5525 3400 \n",
|
||
"L 875 3400 \n",
|
||
"L 875 4200 \n",
|
||
"Q 875 4425 850 4725 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#SimHei-34\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-7e\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-34\" x=\"150\"/>\n",
|
||
" <use xlink:href=\"#SimHei-34\" x=\"200\"/>\n",
|
||
" <use xlink:href=\"#SimHei-5c81\" x=\"250\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_2\">\n",
|
||
" <!-- 70.2% -->\n",
|
||
" <g transform=\"translate(74.081372 70.191829) scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"SimHei-37\" d=\"M 2775 3850 \n",
|
||
"L 1600 100 \n",
|
||
"L 1025 100 \n",
|
||
"L 2225 3900 \n",
|
||
"L 400 3900 \n",
|
||
"L 400 4375 \n",
|
||
"L 2775 4375 \n",
|
||
"L 2775 3850 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"SimHei-2e\" d=\"M 1100 100 \n",
|
||
"L 525 100 \n",
|
||
"L 525 650 \n",
|
||
"L 1100 650 \n",
|
||
"L 1100 100 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"SimHei-32\" d=\"M 2850 100 \n",
|
||
"L 300 100 \n",
|
||
"L 300 500 \n",
|
||
"Q 450 900 712 1237 \n",
|
||
"Q 975 1575 1475 2000 \n",
|
||
"Q 1850 2325 2012 2600 \n",
|
||
"Q 2175 2875 2175 3200 \n",
|
||
"Q 2175 3525 2037 3737 \n",
|
||
"Q 1900 3950 1600 3950 \n",
|
||
"Q 1350 3950 1162 3725 \n",
|
||
"Q 975 3500 975 2925 \n",
|
||
"L 400 2925 \n",
|
||
"Q 425 3650 737 4037 \n",
|
||
"Q 1050 4425 1625 4425 \n",
|
||
"Q 2175 4425 2475 4087 \n",
|
||
"Q 2775 3750 2775 3175 \n",
|
||
"Q 2775 2700 2500 2350 \n",
|
||
"Q 2225 2000 1825 1650 \n",
|
||
"Q 1375 1250 1200 1050 \n",
|
||
"Q 1025 850 875 575 \n",
|
||
"L 2850 575 \n",
|
||
"L 2850 100 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"SimHei-25\" d=\"M 1425 3300 \n",
|
||
"Q 1425 2575 1212 2375 \n",
|
||
"Q 1000 2175 800 2175 \n",
|
||
"Q 600 2175 387 2375 \n",
|
||
"Q 175 2575 175 3300 \n",
|
||
"Q 175 4025 387 4225 \n",
|
||
"Q 600 4425 800 4425 \n",
|
||
"Q 1000 4425 1212 4225 \n",
|
||
"Q 1425 4025 1425 3300 \n",
|
||
"z\n",
|
||
"M 2650 4350 \n",
|
||
"L 725 50 \n",
|
||
"L 525 125 \n",
|
||
"L 2450 4425 \n",
|
||
"L 2650 4350 \n",
|
||
"z\n",
|
||
"M 3000 1175 \n",
|
||
"Q 3000 450 2787 250 \n",
|
||
"Q 2575 50 2375 50 \n",
|
||
"Q 2175 50 1962 250 \n",
|
||
"Q 1750 450 1750 1175 \n",
|
||
"Q 1750 1900 1962 2100 \n",
|
||
"Q 2175 2300 2375 2300 \n",
|
||
"Q 2575 2300 2787 2100 \n",
|
||
"Q 3000 1900 3000 1175 \n",
|
||
"z\n",
|
||
"M 1025 3300 \n",
|
||
"Q 1025 3750 975 3900 \n",
|
||
"Q 925 4050 800 4050 \n",
|
||
"Q 675 4050 625 3900 \n",
|
||
"Q 575 3750 575 3300 \n",
|
||
"Q 575 2850 625 2700 \n",
|
||
"Q 675 2550 800 2550 \n",
|
||
"Q 925 2550 975 2700 \n",
|
||
"Q 1025 2850 1025 3300 \n",
|
||
"z\n",
|
||
"M 2600 1175 \n",
|
||
"Q 2600 1625 2550 1775 \n",
|
||
"Q 2500 1925 2375 1925 \n",
|
||
"Q 2250 1925 2200 1775 \n",
|
||
"Q 2150 1625 2150 1175 \n",
|
||
"Q 2150 725 2200 575 \n",
|
||
"Q 2250 425 2375 425 \n",
|
||
"Q 2500 425 2550 575 \n",
|
||
"Q 2600 725 2600 1175 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#SimHei-37\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-2e\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-32\" x=\"150\"/>\n",
|
||
" <use xlink:href=\"#SimHei-25\" x=\"200\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_3\">\n",
|
||
" <!-- 45~49岁 -->\n",
|
||
" <g transform=\"translate(169.271628 257.111144) scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"SimHei-35\" d=\"M 2825 1650 \n",
|
||
"Q 2825 900 2462 475 \n",
|
||
"Q 2100 50 1500 50 \n",
|
||
"Q 975 50 637 400 \n",
|
||
"Q 300 750 275 1350 \n",
|
||
"L 850 1350 \n",
|
||
"Q 850 975 1025 750 \n",
|
||
"Q 1200 525 1525 525 \n",
|
||
"Q 1850 525 2037 800 \n",
|
||
"Q 2225 1075 2225 1650 \n",
|
||
"Q 2225 2150 2062 2387 \n",
|
||
"Q 1900 2625 1625 2625 \n",
|
||
"Q 1400 2625 1237 2525 \n",
|
||
"Q 1075 2425 925 2175 \n",
|
||
"L 425 2175 \n",
|
||
"L 575 4375 \n",
|
||
"L 2725 4375 \n",
|
||
"L 2725 3900 \n",
|
||
"L 1050 3900 \n",
|
||
"L 950 2750 \n",
|
||
"Q 1100 2900 1275 2975 \n",
|
||
"Q 1450 3050 1750 3050 \n",
|
||
"Q 2225 3050 2525 2687 \n",
|
||
"Q 2825 2325 2825 1650 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" <path id=\"SimHei-39\" d=\"M 2825 2300 \n",
|
||
"Q 2825 1275 2462 662 \n",
|
||
"Q 2100 50 1425 50 \n",
|
||
"Q 950 50 662 400 \n",
|
||
"Q 375 750 375 1175 \n",
|
||
"L 950 1175 \n",
|
||
"Q 950 925 1087 725 \n",
|
||
"Q 1225 525 1450 525 \n",
|
||
"Q 1825 525 2012 950 \n",
|
||
"Q 2200 1375 2250 2200 \n",
|
||
"Q 2125 1925 1900 1775 \n",
|
||
"Q 1675 1625 1400 1625 \n",
|
||
"Q 925 1625 625 1975 \n",
|
||
"Q 325 2325 325 2975 \n",
|
||
"Q 325 3625 625 4025 \n",
|
||
"Q 925 4425 1525 4425 \n",
|
||
"Q 2125 4425 2475 3925 \n",
|
||
"Q 2825 3425 2825 2300 \n",
|
||
"z\n",
|
||
"M 2200 2875 \n",
|
||
"Q 2200 3425 2012 3700 \n",
|
||
"Q 1825 3975 1500 3975 \n",
|
||
"Q 1275 3975 1100 3762 \n",
|
||
"Q 925 3550 925 2975 \n",
|
||
"Q 925 2550 1062 2312 \n",
|
||
"Q 1200 2075 1500 2075 \n",
|
||
"Q 1825 2075 2012 2300 \n",
|
||
"Q 2200 2525 2200 2875 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#SimHei-34\"/>\n",
|
||
" <use xlink:href=\"#SimHei-35\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-7e\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-34\" x=\"150\"/>\n",
|
||
" <use xlink:href=\"#SimHei-39\" x=\"200\"/>\n",
|
||
" <use xlink:href=\"#SimHei-5c81\" x=\"250\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_4\">\n",
|
||
" <!-- 17.5% -->\n",
|
||
" <g transform=\"translate(150.177167 230.685437) scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"SimHei-31\" d=\"M 1950 100 \n",
|
||
"L 1375 100 \n",
|
||
"L 1375 3425 \n",
|
||
"L 625 3425 \n",
|
||
"L 625 3725 \n",
|
||
"Q 1075 3725 1325 3900 \n",
|
||
"Q 1575 4075 1650 4425 \n",
|
||
"L 1950 4425 \n",
|
||
"L 1950 100 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#SimHei-31\"/>\n",
|
||
" <use xlink:href=\"#SimHei-37\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-2e\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-35\" x=\"150\"/>\n",
|
||
" <use xlink:href=\"#SimHei-25\" x=\"200\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_5\">\n",
|
||
" <!-- 35~39岁 -->\n",
|
||
" <g transform=\"translate(247.504602 190.659465) scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"SimHei-33\" d=\"M 2825 1300 \n",
|
||
"Q 2825 725 2462 387 \n",
|
||
"Q 2100 50 1550 50 \n",
|
||
"Q 1000 50 637 387 \n",
|
||
"Q 275 725 275 1425 \n",
|
||
"L 850 1425 \n",
|
||
"Q 850 950 1037 737 \n",
|
||
"Q 1225 525 1550 525 \n",
|
||
"Q 1875 525 2050 725 \n",
|
||
"Q 2225 925 2225 1350 \n",
|
||
"Q 2225 1700 2037 1900 \n",
|
||
"Q 1850 2100 1375 2100 \n",
|
||
"L 1375 2525 \n",
|
||
"Q 1775 2525 1962 2725 \n",
|
||
"Q 2150 2925 2150 3325 \n",
|
||
"Q 2150 3625 2012 3800 \n",
|
||
"Q 1875 3975 1575 3975 \n",
|
||
"Q 1275 3975 1112 3762 \n",
|
||
"Q 950 3550 925 3150 \n",
|
||
"L 375 3150 \n",
|
||
"Q 425 3725 737 4075 \n",
|
||
"Q 1050 4425 1575 4425 \n",
|
||
"Q 2125 4425 2425 4112 \n",
|
||
"Q 2725 3800 2725 3350 \n",
|
||
"Q 2725 2925 2575 2687 \n",
|
||
"Q 2425 2450 2075 2325 \n",
|
||
"Q 2425 2250 2625 1975 \n",
|
||
"Q 2825 1700 2825 1300 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#SimHei-33\"/>\n",
|
||
" <use xlink:href=\"#SimHei-35\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-7e\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-33\" x=\"150\"/>\n",
|
||
" <use xlink:href=\"#SimHei-39\" x=\"200\"/>\n",
|
||
" <use xlink:href=\"#SimHei-5c81\" x=\"250\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_6\">\n",
|
||
" <!-- 11.3% -->\n",
|
||
" <g transform=\"translate(210.62992 179.336412) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-31\"/>\n",
|
||
" <use xlink:href=\"#SimHei-31\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-2e\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-33\" x=\"150\"/>\n",
|
||
" <use xlink:href=\"#SimHei-25\" x=\"200\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_7\">\n",
|
||
" <!-- 50~54岁 -->\n",
|
||
" <g transform=\"translate(257.257487 148.207352) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-35\"/>\n",
|
||
" <use xlink:href=\"#SimHei-30\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-7e\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-35\" x=\"150\"/>\n",
|
||
" <use xlink:href=\"#SimHei-34\" x=\"200\"/>\n",
|
||
" <use xlink:href=\"#SimHei-5c81\" x=\"250\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_8\">\n",
|
||
" <!-- 0.6% -->\n",
|
||
" <g transform=\"translate(220.66624 146.532506) scale(0.1 -0.1)\">\n",
|
||
" <defs>\n",
|
||
" <path id=\"SimHei-36\" d=\"M 2850 1550 \n",
|
||
"Q 2850 850 2550 450 \n",
|
||
"Q 2250 50 1650 50 \n",
|
||
"Q 1050 50 700 550 \n",
|
||
"Q 350 1050 350 2175 \n",
|
||
"Q 350 3200 712 3812 \n",
|
||
"Q 1075 4425 1750 4425 \n",
|
||
"Q 2225 4425 2512 4075 \n",
|
||
"Q 2800 3725 2800 3300 \n",
|
||
"L 2225 3300 \n",
|
||
"Q 2225 3550 2087 3750 \n",
|
||
"Q 1950 3950 1725 3950 \n",
|
||
"Q 1350 3950 1150 3562 \n",
|
||
"Q 950 3175 925 2375 \n",
|
||
"Q 1100 2700 1300 2825 \n",
|
||
"Q 1500 2950 1775 2950 \n",
|
||
"Q 2250 2950 2550 2575 \n",
|
||
"Q 2850 2200 2850 1550 \n",
|
||
"z\n",
|
||
"M 2250 1550 \n",
|
||
"Q 2250 2000 2100 2250 \n",
|
||
"Q 1950 2500 1675 2500 \n",
|
||
"Q 1350 2500 1162 2250 \n",
|
||
"Q 975 2000 975 1650 \n",
|
||
"Q 975 1100 1162 800 \n",
|
||
"Q 1350 500 1675 500 \n",
|
||
"Q 1900 500 2075 725 \n",
|
||
"Q 2250 950 2250 1550 \n",
|
||
"z\n",
|
||
"\" transform=\"scale(0.015625)\"/>\n",
|
||
" </defs>\n",
|
||
" <use xlink:href=\"#SimHei-30\"/>\n",
|
||
" <use xlink:href=\"#SimHei-2e\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-36\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-25\" x=\"150\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_9\">\n",
|
||
" <!-- 55~59岁 -->\n",
|
||
" <g transform=\"translate(257.338866 144.899496) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-35\"/>\n",
|
||
" <use xlink:href=\"#SimHei-35\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-7e\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-35\" x=\"150\"/>\n",
|
||
" <use xlink:href=\"#SimHei-39\" x=\"200\"/>\n",
|
||
" <use xlink:href=\"#SimHei-5c81\" x=\"250\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" <g id=\"text_10\">\n",
|
||
" <!-- 0.3% -->\n",
|
||
" <g transform=\"translate(220.729124 143.976436) scale(0.1 -0.1)\">\n",
|
||
" <use xlink:href=\"#SimHei-30\"/>\n",
|
||
" <use xlink:href=\"#SimHei-2e\" x=\"50\"/>\n",
|
||
" <use xlink:href=\"#SimHei-33\" x=\"100\"/>\n",
|
||
" <use xlink:href=\"#SimHei-25\" x=\"150\"/>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
" </g>\n",
|
||
"</svg>\n"
|
||
],
|
||
"text/plain": [
|
||
"<Figure size 640x480 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"# 绘制饼图\n",
|
||
"temp.plot(\n",
|
||
" kind='pie',\n",
|
||
" ylabel='',\n",
|
||
" autopct='%.1f%%', # 自动计算并显示百分比\n",
|
||
" wedgeprops={'width': 0.3}, # 环状结构部分的宽度\n",
|
||
" pctdistance=0.85, # 百分比到圆心的距离\n",
|
||
" labeldistance=1.1, # 标签到圆心的距离\n",
|
||
" # shadow=True, # 阴影效果\n",
|
||
" # startangle=0, # 起始角度\n",
|
||
" counterclock=True, # 是否反时针方向绘制\n",
|
||
")\n",
|
||
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 115,
|
||
"id": "e846eec2-6c95-409c-8b15-2b14cab3f57c",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"mean 111.849640\n",
|
||
"max 140.050000\n",
|
||
"min 109.920000\n",
|
||
"std 2.481941\n",
|
||
"skew 3.485351\n",
|
||
"kurt 17.390027\n",
|
||
"Name: 积分分值, dtype: float64"
|
||
]
|
||
},
|
||
"execution_count": 115,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# agg - aggregate - 聚合\n",
|
||
"settle_df.积分分值.agg(['mean', 'max', 'min', 'std', 'skew', 'kurt'])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "b1669102-1c03-4751-813c-b241a05718e3",
|
||
"metadata": {},
|
||
"source": [
|
||
"线性归一化:\n",
|
||
"$$\n",
|
||
"x^{\\prime} = \\frac{x - x_{min}}{x_{max} - x_{min}}\n",
|
||
"$$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 116,
|
||
"id": "e8d9dca7-b976-43ab-96b8-abefca66cc53",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"(140.05, 109.92)"
|
||
]
|
||
},
|
||
"execution_count": 116,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# 将积分分值处理成0~1范围的值\n",
|
||
"max_score, min_score = settle_df.积分分值.agg(['max', 'min'])\n",
|
||
"max_score, min_score"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 117,
|
||
"id": "10acd550-8422-4934-b38f-03554f86d305",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>姓名</th>\n",
|
||
" <th>出生年月</th>\n",
|
||
" <th>单位名称</th>\n",
|
||
" <th>积分分值</th>\n",
|
||
" <th>年龄</th>\n",
|
||
" <th>年龄段</th>\n",
|
||
" <th>线性归一化积分</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>公示编号</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>202300001</th>\n",
|
||
" <td>张浩</td>\n",
|
||
" <td>1977-02-01</td>\n",
|
||
" <td>北京首钢股份有限公司</td>\n",
|
||
" <td>140.05</td>\n",
|
||
" <td>45</td>\n",
|
||
" <td>45~49岁</td>\n",
|
||
" <td>1.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300002</th>\n",
|
||
" <td>冯云</td>\n",
|
||
" <td>1982-02-01</td>\n",
|
||
" <td>中国人民解放军空军二十三厂</td>\n",
|
||
" <td>134.29</td>\n",
|
||
" <td>40</td>\n",
|
||
" <td>40~44岁</td>\n",
|
||
" <td>0.81</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300003</th>\n",
|
||
" <td>王天东</td>\n",
|
||
" <td>1975-01-01</td>\n",
|
||
" <td>中建二局第三建筑工程有限公司</td>\n",
|
||
" <td>133.63</td>\n",
|
||
" <td>48</td>\n",
|
||
" <td>45~49岁</td>\n",
|
||
" <td>0.79</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300004</th>\n",
|
||
" <td>陈军</td>\n",
|
||
" <td>1976-07-01</td>\n",
|
||
" <td>中建二局第三建筑工程有限公司</td>\n",
|
||
" <td>133.29</td>\n",
|
||
" <td>46</td>\n",
|
||
" <td>45~49岁</td>\n",
|
||
" <td>0.78</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300005</th>\n",
|
||
" <td>樊海瑞</td>\n",
|
||
" <td>1981-06-01</td>\n",
|
||
" <td>中国民生银行股份有限公司</td>\n",
|
||
" <td>132.46</td>\n",
|
||
" <td>41</td>\n",
|
||
" <td>40~44岁</td>\n",
|
||
" <td>0.75</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202305999</th>\n",
|
||
" <td>曹恰</td>\n",
|
||
" <td>1983-09-01</td>\n",
|
||
" <td>首都师范大学科德学院</td>\n",
|
||
" <td>109.92</td>\n",
|
||
" <td>39</td>\n",
|
||
" <td>35~39岁</td>\n",
|
||
" <td>0.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202306000</th>\n",
|
||
" <td>罗佳</td>\n",
|
||
" <td>1981-05-01</td>\n",
|
||
" <td>厦门方胜众合企业服务有限公司海淀分公司</td>\n",
|
||
" <td>109.92</td>\n",
|
||
" <td>41</td>\n",
|
||
" <td>40~44岁</td>\n",
|
||
" <td>0.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202306001</th>\n",
|
||
" <td>席盛代</td>\n",
|
||
" <td>1983-06-01</td>\n",
|
||
" <td>中国华能集团清洁能源技术研究院有限公司</td>\n",
|
||
" <td>109.92</td>\n",
|
||
" <td>39</td>\n",
|
||
" <td>35~39岁</td>\n",
|
||
" <td>0.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202306002</th>\n",
|
||
" <td>彭芸芸</td>\n",
|
||
" <td>1981-09-01</td>\n",
|
||
" <td>北京汉杰凯德文化传播有限公司</td>\n",
|
||
" <td>109.92</td>\n",
|
||
" <td>41</td>\n",
|
||
" <td>40~44岁</td>\n",
|
||
" <td>0.00</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202306003</th>\n",
|
||
" <td>张越</td>\n",
|
||
" <td>1982-01-01</td>\n",
|
||
" <td>大爱城投资控股有限公司</td>\n",
|
||
" <td>109.92</td>\n",
|
||
" <td>41</td>\n",
|
||
" <td>40~44岁</td>\n",
|
||
" <td>0.00</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>6003 rows × 7 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 姓名 出生年月 单位名称 积分分值 年龄 年龄段 线性归一化积分\n",
|
||
"公示编号 \n",
|
||
"202300001 张浩 1977-02-01 北京首钢股份有限公司 140.05 45 45~49岁 1.00\n",
|
||
"202300002 冯云 1982-02-01 中国人民解放军空军二十三厂 134.29 40 40~44岁 0.81\n",
|
||
"202300003 王天东 1975-01-01 中建二局第三建筑工程有限公司 133.63 48 45~49岁 0.79\n",
|
||
"202300004 陈军 1976-07-01 中建二局第三建筑工程有限公司 133.29 46 45~49岁 0.78\n",
|
||
"202300005 樊海瑞 1981-06-01 中国民生银行股份有限公司 132.46 41 40~44岁 0.75\n",
|
||
"... ... ... ... ... .. ... ...\n",
|
||
"202305999 曹恰 1983-09-01 首都师范大学科德学院 109.92 39 35~39岁 0.00\n",
|
||
"202306000 罗佳 1981-05-01 厦门方胜众合企业服务有限公司海淀分公司 109.92 41 40~44岁 0.00\n",
|
||
"202306001 席盛代 1983-06-01 中国华能集团清洁能源技术研究院有限公司 109.92 39 35~39岁 0.00\n",
|
||
"202306002 彭芸芸 1981-09-01 北京汉杰凯德文化传播有限公司 109.92 41 40~44岁 0.00\n",
|
||
"202306003 张越 1982-01-01 大爱城投资控股有限公司 109.92 41 40~44岁 0.00\n",
|
||
"\n",
|
||
"[6003 rows x 7 columns]"
|
||
]
|
||
},
|
||
"execution_count": 117,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# map - 映射 - 将指定的函数作用到数据系列的每个元素上\n",
|
||
"# apply - 应用 - 将指定的函数应用到数据系列的每个元素上\n",
|
||
"settle_df['线性归一化积分'] = settle_df.积分分值.map(lambda x: (x - min_score) / (max_score - min_score)).round(2)\n",
|
||
"settle_df"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "55e57b00-cb9e-4c9e-bc59-e99b738e2f5d",
|
||
"metadata": {},
|
||
"source": [
|
||
"zscore标准化:\n",
|
||
"$$\n",
|
||
"x^{\\prime} = \\frac{x - \\mu}{\\sigma}\n",
|
||
"$$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 118,
|
||
"id": "b5fc6260-5337-4161-99f0-d7be43d59361",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>姓名</th>\n",
|
||
" <th>出生年月</th>\n",
|
||
" <th>单位名称</th>\n",
|
||
" <th>积分分值</th>\n",
|
||
" <th>年龄</th>\n",
|
||
" <th>年龄段</th>\n",
|
||
" <th>线性归一化积分</th>\n",
|
||
" <th>zscore评分</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>公示编号</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>202300001</th>\n",
|
||
" <td>张浩</td>\n",
|
||
" <td>1977-02-01</td>\n",
|
||
" <td>北京首钢股份有限公司</td>\n",
|
||
" <td>140.05</td>\n",
|
||
" <td>45</td>\n",
|
||
" <td>45~49岁</td>\n",
|
||
" <td>1.00</td>\n",
|
||
" <td>11.362219</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300002</th>\n",
|
||
" <td>冯云</td>\n",
|
||
" <td>1982-02-01</td>\n",
|
||
" <td>中国人民解放军空军二十三厂</td>\n",
|
||
" <td>134.29</td>\n",
|
||
" <td>40</td>\n",
|
||
" <td>40~44岁</td>\n",
|
||
" <td>0.81</td>\n",
|
||
" <td>9.041455</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300003</th>\n",
|
||
" <td>王天东</td>\n",
|
||
" <td>1975-01-01</td>\n",
|
||
" <td>中建二局第三建筑工程有限公司</td>\n",
|
||
" <td>133.63</td>\n",
|
||
" <td>48</td>\n",
|
||
" <td>45~49岁</td>\n",
|
||
" <td>0.79</td>\n",
|
||
" <td>8.775534</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300004</th>\n",
|
||
" <td>陈军</td>\n",
|
||
" <td>1976-07-01</td>\n",
|
||
" <td>中建二局第三建筑工程有限公司</td>\n",
|
||
" <td>133.29</td>\n",
|
||
" <td>46</td>\n",
|
||
" <td>45~49岁</td>\n",
|
||
" <td>0.78</td>\n",
|
||
" <td>8.638545</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202300005</th>\n",
|
||
" <td>樊海瑞</td>\n",
|
||
" <td>1981-06-01</td>\n",
|
||
" <td>中国民生银行股份有限公司</td>\n",
|
||
" <td>132.46</td>\n",
|
||
" <td>41</td>\n",
|
||
" <td>40~44岁</td>\n",
|
||
" <td>0.75</td>\n",
|
||
" <td>8.304129</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>...</th>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" <td>...</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202305999</th>\n",
|
||
" <td>曹恰</td>\n",
|
||
" <td>1983-09-01</td>\n",
|
||
" <td>首都师范大学科德学院</td>\n",
|
||
" <td>109.92</td>\n",
|
||
" <td>39</td>\n",
|
||
" <td>35~39岁</td>\n",
|
||
" <td>0.00</td>\n",
|
||
" <td>-0.777472</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202306000</th>\n",
|
||
" <td>罗佳</td>\n",
|
||
" <td>1981-05-01</td>\n",
|
||
" <td>厦门方胜众合企业服务有限公司海淀分公司</td>\n",
|
||
" <td>109.92</td>\n",
|
||
" <td>41</td>\n",
|
||
" <td>40~44岁</td>\n",
|
||
" <td>0.00</td>\n",
|
||
" <td>-0.777472</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202306001</th>\n",
|
||
" <td>席盛代</td>\n",
|
||
" <td>1983-06-01</td>\n",
|
||
" <td>中国华能集团清洁能源技术研究院有限公司</td>\n",
|
||
" <td>109.92</td>\n",
|
||
" <td>39</td>\n",
|
||
" <td>35~39岁</td>\n",
|
||
" <td>0.00</td>\n",
|
||
" <td>-0.777472</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202306002</th>\n",
|
||
" <td>彭芸芸</td>\n",
|
||
" <td>1981-09-01</td>\n",
|
||
" <td>北京汉杰凯德文化传播有限公司</td>\n",
|
||
" <td>109.92</td>\n",
|
||
" <td>41</td>\n",
|
||
" <td>40~44岁</td>\n",
|
||
" <td>0.00</td>\n",
|
||
" <td>-0.777472</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>202306003</th>\n",
|
||
" <td>张越</td>\n",
|
||
" <td>1982-01-01</td>\n",
|
||
" <td>大爱城投资控股有限公司</td>\n",
|
||
" <td>109.92</td>\n",
|
||
" <td>41</td>\n",
|
||
" <td>40~44岁</td>\n",
|
||
" <td>0.00</td>\n",
|
||
" <td>-0.777472</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"<p>6003 rows × 8 columns</p>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" 姓名 出生年月 单位名称 积分分值 年龄 年龄段 线性归一化积分 \\\n",
|
||
"公示编号 \n",
|
||
"202300001 张浩 1977-02-01 北京首钢股份有限公司 140.05 45 45~49岁 1.00 \n",
|
||
"202300002 冯云 1982-02-01 中国人民解放军空军二十三厂 134.29 40 40~44岁 0.81 \n",
|
||
"202300003 王天东 1975-01-01 中建二局第三建筑工程有限公司 133.63 48 45~49岁 0.79 \n",
|
||
"202300004 陈军 1976-07-01 中建二局第三建筑工程有限公司 133.29 46 45~49岁 0.78 \n",
|
||
"202300005 樊海瑞 1981-06-01 中国民生银行股份有限公司 132.46 41 40~44岁 0.75 \n",
|
||
"... ... ... ... ... .. ... ... \n",
|
||
"202305999 曹恰 1983-09-01 首都师范大学科德学院 109.92 39 35~39岁 0.00 \n",
|
||
"202306000 罗佳 1981-05-01 厦门方胜众合企业服务有限公司海淀分公司 109.92 41 40~44岁 0.00 \n",
|
||
"202306001 席盛代 1983-06-01 中国华能集团清洁能源技术研究院有限公司 109.92 39 35~39岁 0.00 \n",
|
||
"202306002 彭芸芸 1981-09-01 北京汉杰凯德文化传播有限公司 109.92 41 40~44岁 0.00 \n",
|
||
"202306003 张越 1982-01-01 大爱城投资控股有限公司 109.92 41 40~44岁 0.00 \n",
|
||
"\n",
|
||
" zscore评分 \n",
|
||
"公示编号 \n",
|
||
"202300001 11.362219 \n",
|
||
"202300002 9.041455 \n",
|
||
"202300003 8.775534 \n",
|
||
"202300004 8.638545 \n",
|
||
"202300005 8.304129 \n",
|
||
"... ... \n",
|
||
"202305999 -0.777472 \n",
|
||
"202306000 -0.777472 \n",
|
||
"202306001 -0.777472 \n",
|
||
"202306002 -0.777472 \n",
|
||
"202306003 -0.777472 \n",
|
||
"\n",
|
||
"[6003 rows x 8 columns]"
|
||
]
|
||
},
|
||
"execution_count": 118,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"mu, sigma = settle_df.积分分值.agg(['mean', 'std'])\n",
|
||
"settle_df['zscore评分'] = settle_df.积分分值.apply(lambda x: (x - mu) / sigma)\n",
|
||
"settle_df"
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"kernelspec": {
|
||
"display_name": "Python 3",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.9.13"
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 5
|
||
}
|