Process Walkthrough
Data Acquisition
import pandas as pd

# Load the training set and preview the first two rows
tain_data = pd.read_csv('./source/train-data.csv')
tain_data.head(n=2)
Output (transposed so that each column fits on one line):

Column                                  Row 0      Row 1
(index)                                 0          1
Unnamed: 0                              0          1
Unnamed: 0.1                            1          2
SeriousDlqin2yrs                        1          0
RevolvingUtilizationOfUnsecuredLines    0.766127   0.957151
age                                     45         40
NumberOfTime30-59DaysPastDueNotWorse    2          0
DebtRatio                               0.802982   0.121876
MonthlyIncome                           9120.0     2600.0
NumberOfOpenCreditLinesAndLoans         13         4
NumberOfTimes90DaysLate                 0          0
NumberRealEstateLoansOrLines            6          0
NumberOfTime60-89DaysPastDueNotWorse    0          0
NumberOfDependents                      2.0        1.0

Renaming the Complex Columns
# Map the long original column names to shorter ones
columns_replace = {
    'SeriousDlqin2yrs': 'target',
    'RevolvingUtilizationOfUnsecuredLines': 'percentage',
    'NumberOfOpenCreditLinesAndLoans': 'open_loan',
    'NumberOfTimes90DaysLate': '90-',
    'NumberRealEstateLoansOrLines': 'estate_loan',
    'NumberOfTime60-89DaysPastDueNotWorse': '60-89',
    'NumberOfDependents': 'Dependents',
    'NumberOfTime30-59DaysPastDueNotWorse': '30-59'
}
# Rename in place, then preview the first 20 rows
tain_data.rename(columns=columns_replace, inplace=True)
tain_data.head(n=20)
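By default, DataFrame.rename silently skips mapping keys that do not match any column, so a typo in one of these long names would go unnoticed. The following is a minimal sketch of a pre-check, not part of the original workflow: check_rename_map is a hypothetical helper introduced here for illustration, and the column list is copied by hand from the head() output in the data acquisition step rather than read from the CSV.

```python
# Hypothetical helper (not from the original post): validate a rename
# mapping against the actual column names before calling DataFrame.rename.

def check_rename_map(columns, mapping):
    """Return (missing, collisions) for a column rename mapping.

    missing    -- mapping keys that are not present in `columns`
    collisions -- resulting names that would appear more than once
    """
    columns = list(columns)
    missing = [old for old in mapping if old not in columns]
    # Names the frame would have after the rename is applied
    renamed = [mapping.get(col, col) for col in columns]
    seen, collisions = set(), []
    for name in renamed:
        if name in seen and name not in collisions:
            collisions.append(name)
        seen.add(name)
    return missing, collisions


# Assumption: column names copied from the head() output of the raw data.
columns = [
    'Unnamed: 0', 'Unnamed: 0.1', 'SeriousDlqin2yrs',
    'RevolvingUtilizationOfUnsecuredLines', 'age',
    'NumberOfTime30-59DaysPastDueNotWorse', 'DebtRatio', 'MonthlyIncome',
    'NumberOfOpenCreditLinesAndLoans', 'NumberOfTimes90DaysLate',
    'NumberRealEstateLoansOrLines', 'NumberOfTime60-89DaysPastDueNotWorse',
    'NumberOfDependents',
]

columns_replace = {
    'SeriousDlqin2yrs': 'target',
    'RevolvingUtilizationOfUnsecuredLines': 'percentage',
    'NumberOfOpenCreditLinesAndLoans': 'open_loan',
    'NumberOfTimes90DaysLate': '90-',
    'NumberRealEstateLoansOrLines': 'estate_loan',
    'NumberOfTime60-89DaysPastDueNotWorse': '60-89',
    'NumberOfDependents': 'Dependents',
    'NumberOfTime30-59DaysPastDueNotWorse': '30-59',
}

missing, collisions = check_rename_map(columns, columns_replace)
print(missing, collisions)  # prints: [] []
```

With a real DataFrame, the first argument would be the frame's columns attribute. pandas also accepts errors='raise' in rename, which raises a KeyError for labels that are not present; the helper above additionally reports new names that would collide.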
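Original author: Neo Anderson
Original link: https://www.neofaster.cc/archives/d4663896.html
Published: October 19th 2019, 3:00:39 pm
Updated: August 28th 2021, 10:51:27 am
License: This article is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license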