Neo Anderson's Blog

General Steps for Building a Credit Scorecard Model, Implementation Details (Part 1): Data Acquisition

2019/10/19
  • Process walkthrough

    • Data acquisition

    import pandas as pd

    tain_data = pd.read_csv('./source/train-data.csv')  # load the training set
    tain_data.head(n=2)  # preview the first two rows

    Output of head(n=2), transposed here so that each column reads on one line:

    column                                  row 0       row 1
    Unnamed: 0                              0           1
    Unnamed: 0.1                            1           2
    SeriousDlqin2yrs                        1           0
    RevolvingUtilizationOfUnsecuredLines    0.766127    0.957151
    age                                     45          40
    NumberOfTime30-59DaysPastDueNotWorse    2           0
    DebtRatio                               0.802982    0.121876
    MonthlyIncome                           9120.0      2600.0
    NumberOfOpenCreditLinesAndLoans         13          4
    NumberOfTimes90DaysLate                 0           0
    NumberRealEstateLoansOrLines            6           0
    NumberOfTime60-89DaysPastDueNotWorse    0           0
    NumberOfDependents                      2.0         1.0
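
    The two `Unnamed` columns in the preview look like row indexes left over from earlier CSV exports. That is an assumption, since the post does not say where they come from. If so, they carry no information for the scorecard and can be dropped right after loading. The sketch below uses a small hand-built frame (values copied from the two preview rows, trimmed to a few columns) rather than the real file, and the names `sample`, `unnamed_cols` and `cleaned` are illustrative only.

```python
import pandas as pd

# Stand-in for the loaded training data: the two preview rows, trimmed to
# a handful of columns. The real file is not read here.
sample = pd.DataFrame({
    "Unnamed: 0": [0, 1],
    "Unnamed: 0.1": [1, 2],
    "SeriousDlqin2yrs": [1, 0],
    "age": [45, 40],
    "MonthlyIncome": [9120.0, 2600.0],
})

# Collect every column whose name starts with "Unnamed" (assumed to be
# leftover index columns) and drop them, returning a new frame.
unnamed_cols = [c for c in sample.columns if c.startswith("Unnamed")]
cleaned = sample.drop(columns=unnamed_cols)

print(unnamed_cols)
print(list(cleaned.columns))
```

    Before dropping anything in the real data, it is worth confirming that these columns really are just running row numbers.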
    • Rename the verbosely named columns

    # Map the verbose original column names to shorter ones.
    columns_replace = {
        'SeriousDlqin2yrs': 'target',
        'RevolvingUtilizationOfUnsecuredLines': 'percentage',
        'NumberOfOpenCreditLinesAndLoans': 'open_loan',
        'NumberOfTimes90DaysLate': '90-',
        'NumberRealEstateLoansOrLines': 'estate_loan',
        'NumberOfTime60-89DaysPastDueNotWorse': '60-89',
        'NumberOfDependents': 'Dependents',
        'NumberOfTime30-59DaysPastDueNotWorse': '30-59',
    }

    # Rename in place, then preview the first 20 rows.
    tain_data.rename(columns=columns_replace, inplace=True)
    tain_data.head(n=20)
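
    By default, `DataFrame.rename` silently skips any mapping key that does not match a column, so a typo in one of the long names would go unnoticed. The sketch below is one possible check, not part of the original post: it rebuilds the mapping, applies it to an empty frame that has only the original column names (a hypothetical stand-in for the training data), and lists any keys that matched nothing. The names `original_columns`, `sample`, `missing` and `renamed` are illustrative.

```python
import pandas as pd

columns_replace = {
    'SeriousDlqin2yrs': 'target',
    'RevolvingUtilizationOfUnsecuredLines': 'percentage',
    'NumberOfOpenCreditLinesAndLoans': 'open_loan',
    'NumberOfTimes90DaysLate': '90-',
    'NumberRealEstateLoansOrLines': 'estate_loan',
    'NumberOfTime60-89DaysPastDueNotWorse': '60-89',
    'NumberOfDependents': 'Dependents',
    'NumberOfTime30-59DaysPastDueNotWorse': '30-59',
}

# Column names as they appear in the preview of the raw file.
original_columns = [
    'Unnamed: 0', 'Unnamed: 0.1', 'SeriousDlqin2yrs',
    'RevolvingUtilizationOfUnsecuredLines', 'age',
    'NumberOfTime30-59DaysPastDueNotWorse', 'DebtRatio', 'MonthlyIncome',
    'NumberOfOpenCreditLinesAndLoans', 'NumberOfTimes90DaysLate',
    'NumberRealEstateLoansOrLines', 'NumberOfTime60-89DaysPastDueNotWorse',
    'NumberOfDependents',
]

# Empty frame with the original header; no data rows are needed for the check.
sample = pd.DataFrame(columns=original_columns)

# Mapping keys that do not correspond to any column (empty if all match).
missing = [name for name in columns_replace if name not in sample.columns]

# Non-inplace rename: returns a new frame and leaves `sample` untouched.
renamed = sample.rename(columns=columns_replace)

print(missing)
print(list(renamed.columns))
```

    Columns that are not in the mapping (`age`, `DebtRatio`, `MonthlyIncome`) keep their names. Passing `errors='raise'` to `rename` is another way to surface unmatched keys.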