We are looking for a Data Science specialist with experience in banking projects to build risk models (credit scoring for lending).
Key Areas of Responsibility
Risk Modeling Department:
- Full-cycle development of ensemble models, including data preparation and preprocessing, labeling, and splitting into training and testing datasets.
- Selection and tuning of base models with an emphasis on diversity to improve prediction quality.
- Development of machine learning models to forecast daily balances on corporate client accounts, combining time series analysis of weekly, monthly, and quarterly patterns with additional factors (weekdays, holidays, tax periods, business cycles).
- Training personalized models.
- Application of model aggregation techniques (bagging, boosting, stacking) with optimized ensemble weighting.
- Performance evaluation using accuracy, recall, and F1-score metrics to guide improvements in prediction quality.
- Deployment of models into production environments, ongoing monitoring, and regular parameter optimization.
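For orientation, the weighted aggregation and evaluation steps listed above could look roughly like the sketch below. It is only an illustration under stated assumptions: the three base models, their probabilities, the weights, and the 0.5 threshold are made-up values, and `blend` and `classification_metrics` are hypothetical helper names, not part of any existing codebase.

```python
def blend(prob_columns, weights):
    """Weighted average of per-model probabilities (weights are normalized)."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, row)) / total
        for row in zip(*prob_columns)
    ]


def classification_metrics(y_true, y_pred):
    """Accuracy, recall, and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "recall": recall, "f1": f1}


# Made-up illustrative data: true default flags and three base-model outputs.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
model_a = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1, 0.7, 0.3]
model_b = [0.8, 0.4, 0.6, 0.7, 0.3, 0.2, 0.4, 0.5]
model_c = [0.7, 0.1, 0.3, 0.9, 0.4, 0.3, 0.6, 0.2]
weights = [0.5, 0.3, 0.2]  # assumed; in practice tuned on validation data

blended = blend([model_a, model_b, model_c], weights)
y_pred = [1 if p >= 0.5 else 0 for p in blended]  # assumed 0.5 cut-off
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In real work the weights and the cut-off would be chosen on a held-out validation set, and library implementations such as those in Scikit-learn would normally replace the hand-written helpers.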
Computer Vision Projects:
- Development and implementation of a biometric identity verification system, including document recognition and photo comparison modules.
- Requirements analysis and system architecture design with a focus on high security and recognition accuracy standards.
- Implementation of image processing algorithms to extract data from passports and compare the passport photo with the client's selfie.
Acquiring Analytics:
- Comprehensive analysis of acquiring and cash management portfolio data, including collection and preprocessing of historical client behavior data.
- Feature engineering reflecting transactional activity, financial indicators, and service usage patterns to identify key churn factors.
- Building and training an ensemble prediction model optimized for the specifics of both products.
- Implementation of a client scoring system based on churn probability, taking into account financial behavior and length of the client relationship.
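As a rough illustration of the churn scoring item above, a client's churn probability might be turned into a risk band as sketched below. Everything specific here is an assumption for the example: the feature names (`tenure_years`, `turnover_drop`), the coefficients, and the band thresholds are hypothetical and would come from a fitted model and business rules in practice.

```python
import math

# Hypothetical coefficients; in practice these would come from a trained model.
INTERCEPT = -0.5
COEFFS = {"tenure_years": -0.3, "turnover_drop": 2.0}


def churn_probability(features):
    """Logistic model: longer tenure lowers, falling turnover raises the risk."""
    z = INTERCEPT + sum(COEFFS[name] * features[name] for name in COEFFS)
    return 1.0 / (1.0 + math.exp(-z))


def churn_band(probability):
    """Map a churn probability to a band (thresholds are assumed)."""
    if probability >= 0.6:
        return "high"
    if probability >= 0.3:
        return "medium"
    return "low"


# Made-up example clients.
clients = {
    "long_standing_stable": {"tenure_years": 8, "turnover_drop": 0.0},
    "new_declining": {"tenure_years": 1, "turnover_drop": 0.8},
}

bands = {name: churn_band(churn_probability(f)) for name, f in clients.items()}
print(bands)
```

The banding step is kept separate from the probability model so that thresholds can be adjusted without retraining.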
Technologies and Tools: Python, SQL, Scikit-learn, XGBoost, LightGBM, CatBoost, TensorFlow/Keras, PyTorch, Random Forest, Gradient Boosting, Stacking, Pandas, NumPy, Matplotlib, Seaborn.