Task: Detect and handle outliers in the recommendation/search entry columns
Summary of execution and progress
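# Allowed entry channels: search, recommendation, home main banner, external link
valid_channels = ['검색', '추천', '홈메인배너', '외부 링크']
# Any value outside the allowed list (including NaN) is relabeled as 'Other'
df.loc[~df['entry_channel'].isin(valid_channels), 'entry_channel'] = 'Other'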
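Before overwriting values, it can help to see which ones actually fall outside the allowed list. The following is a minimal sketch, assuming a pandas DataFrame with an `entry_channel` column; the helper name `normalize_entry_channel` and the sample rows are hypothetical and are not part of the original dataset. Note that `isin` returns False for missing values, so NaN entries are also relabeled, as in the snippet above.

```python
import pandas as pd

VALID_CHANNELS = ['검색', '추천', '홈메인배너', '외부 링크']

def normalize_entry_channel(df, valid=VALID_CHANNELS, fallback='Other'):
    """Hypothetical helper: return a cleaned copy and the counts of replaced values."""
    out = df.copy()
    invalid = ~out['entry_channel'].isin(valid)
    # Record the offending values (NaN included) before they are overwritten
    replaced = out.loc[invalid, 'entry_channel'].value_counts(dropna=False)
    out.loc[invalid, 'entry_channel'] = fallback
    return out, replaced

# Made-up sample rows for illustration only
sample = pd.DataFrame({'entry_channel': ['검색', '추천', 'sns', None, '추천 ']})
cleaned, replaced = normalize_entry_channel(sample)
print(replaced)                           # which values were relabeled, and how often
print(cleaned['entry_channel'].tolist())
```

Inspecting `replaced` first shows whether the outliers are genuine unknown channels or near-misses such as a trailing space, which might be better fixed by stripping whitespace than by relabeling.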
Result
user_id gender birthday device_type subscription_plan theme_mode \
0 user_0001 male 1998-06-17 mobile monthly dark
1 user_0002 male 2010-04-15 mobile monthly dark
2 user_0003 male 1985-02-13 mobile monthly dark
3 user_0004 female 1974-11-21 eReader monthly light
4 user_0005 male 1970-11-01 mobile free_trial dark
.. ... ... ... ... ... ...
995 user_0996 male 2008-08-11 tablet monthly light
996 user_0997 male 1981-01-23 tablet monthly light
997 user_0998 male 2009-07-08 tablet monthly light
998 user_0999 male 2015-07-05 tablet monthly light
999 user_1000 male 1998-10-06 tablet monthly light
entry_channel quick_preview_used recommendation_clicked \
0 추천 0.0 0.0
1 추천 0.0 0.0
2 추천 0.0 0.0
3 추천 0.0 0.0
4 검색 0.0 0.0
.. ... ... ...
995 추천 0.0 0.0
996 추천 0.0 0.0
997 추천 0.0 0.0
998 추천 0.0 0.0
999 추천 0.0 0.0
last_access_timestamp ... dropout_reason_category dropout_reason_detail \
0 2023-05-08 14:13:00 ... 자발적 지루함
1 2023-01-13 00:54:00 ... 자발적 추천 실패
2 2023-07-12 09:13:03 ... 자발적 너무 김
3 2023-11-20 21:24:55 ... UX 불편 NaN
4 2023-11-01 05:55:04 ... UX 불편 NaN
.. ... ... ... ...
995 2023-09-03 17:44:46 ... 자발적 추천 실패
996 2023-04-06 07:03:38 ... 자발적 너무 김
997 2023-12-26 13:44:38 ... UX 불편 NaN
998 2023-06-26 22:39:14 ... 자발적 금한일
999 2023-01-01 20:58:12 ... 자발적 지루함
age age_group exit_range exit_stage access_hour time_band \
0 27.0 20s 61-80% 35~90% 14.0 Afternoon
1 15.0 10s 41-60% 35~90% 0.0 Dawn
2 40.0 40s 41-60% 35~90% 9.0 Morning
3 51.0 50s+ 81-100% 90~100% 21.0 Evening
4 55.0 50s+ 41-60% 35~90% 5.0 Dawn
.. ... ... ... ... ... ...
995 17.0 10s 21-40% 35~90% 17.0 Afternoon
996 44.0 40s 61-80% 35~90% 7.0 Morning
997 16.0 10s 0-20% 0~10% 13.0 Afternoon
998 10.0 NaN 0-20% 10~35% 22.0 Evening
999 27.0 20s 61-80% 35~90% 20.0 Evening
is_completed is_dropout
0 False True
1 False True
2 False True
3 True False
4 False True
.. ... ...
995 False True
996 False True
997 False True
998 False True
999 False True