Neo Anderson's Blog

General Steps for Building a Credit Scorecard Model, Implementation Details (Part 1): Data Acquisition

Word count: 152 | Reading time: 1 min
2019/10/19
  • Process walkthrough

    • Data acquisition

      import pandas as pd

      # Load the training data and preview the first two rows
      tain_data = pd.read_csv('./source/train-data.csv')
      tain_data.head(n=2)

      Output (transposed for readability; the two value columns are rows 0 and 1):

      column                                  row 0      row 1
      Unnamed: 0                              0          1
      Unnamed: 0.1                            1          2
      SeriousDlqin2yrs                        1          0
      RevolvingUtilizationOfUnsecuredLines    0.766127   0.957151
      age                                     45         40
      NumberOfTime30-59DaysPastDueNotWorse    2          0
      DebtRatio                               0.802982   0.121876
      MonthlyIncome                           9120.0     2600.0
      NumberOfOpenCreditLinesAndLoans         13         4
      NumberOfTimes90DaysLate                 0          0
      NumberRealEstateLoansOrLines            6          0
      NumberOfTime60-89DaysPastDueNotWorse    0          0
      NumberOfDependents                      2.0        1.0
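
      The two `Unnamed:` columns in the preview look like index columns that were written to CSV and read back as ordinary data. The following is a minimal sketch of dropping them, assuming they carry no modeling information, which the post does not state. The inline sample CSV and the helper name `drop_unnamed_columns` are hypothetical stand-ins for illustration.

```python
import io

import pandas as pd

# Hypothetical two-row sample mimicking the shape of train-data.csv
# (only a few of the real columns are included).
SAMPLE_CSV = """Unnamed: 0,Unnamed: 0.1,SeriousDlqin2yrs,age,MonthlyIncome
0,1,1,45,9120.0
1,2,0,40,2600.0
"""


def drop_unnamed_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df without columns whose name starts with 'Unnamed:'."""
    keep = [col for col in df.columns if not str(col).startswith("Unnamed:")]
    return df[keep].copy()


raw = pd.read_csv(io.StringIO(SAMPLE_CSV))
cleaned = drop_unnamed_columns(raw)
print(list(raw.columns))
print(list(cleaned.columns))
```

      Returning a copy leaves the originally loaded frame untouched, so the raw data stays available for comparison.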
      • Rename columns with overly long names

        # Map the long original column names to shorter aliases
        columns_replace = {
            'SeriousDlqin2yrs': 'target',
            'RevolvingUtilizationOfUnsecuredLines': 'percentage',
            'NumberOfOpenCreditLinesAndLoans': 'open_loan',
            'NumberOfTimes90DaysLate': '90-',
            'NumberRealEstateLoansOrLines': 'estate_loan',
            'NumberOfTime60-89DaysPastDueNotWorse': '60-89',
            'NumberOfDependents': 'Dependents',
            'NumberOfTime30-59DaysPastDueNotWorse': '30-59',
        }

        # Rename in place, then preview the first 20 rows
        tain_data.rename(columns=columns_replace, inplace=True)
        tain_data.head(n=20)
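
        By default, `DataFrame.rename` silently skips mapping keys that do not match any column, so a misspelled source name goes unnoticed. The following is a minimal sketch of a stricter variant, assuming pandas 0.25 or later, where `rename` accepts `errors="raise"`. The one-row frame and the shortened mapping are hypothetical and only mirror a few of the names used above.

```python
import pandas as pd

# Hypothetical one-row frame using three of the original column names.
df = pd.DataFrame(
    {
        "SeriousDlqin2yrs": [1],
        "NumberOfDependents": [2.0],
        "age": [45],
    }
)

mapping = {"SeriousDlqin2yrs": "target", "NumberOfDependents": "Dependents"}

# Non-inplace rename returns a new DataFrame and leaves df untouched.
# errors="raise" makes pandas raise KeyError if a key in mapping is not
# an existing column (the default, errors="ignore", skips it silently).
renamed = df.rename(columns=mapping, errors="raise")
print(list(renamed.columns))  # ['target', 'Dependents', 'age']

# A misspelled source name is caught instead of being ignored.
try:
    df.rename(columns={"SeriousDlqin2yr": "target"}, errors="raise")
    typo_caught = False
except KeyError:
    typo_caught = True
print(typo_caught)
```

        Whether to rename in place or return a new frame is a matter of taste. The non-inplace form shown here makes it easier to compare the column names before and after.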