NirajanKhadka/RiskFlow-From-Data-to-Deployment

A machine learning pipeline that predicts loan default risk using a Random Forest classifier, served via a FastAPI REST API with a web UI and a Dockerized deployment.

This project uses machine learning to predict loan defaults, featuring data analysis, model training with RandomForest, and real-time predictions via a FastAPI-based API. It's Dockerized and deployable on AWS Lambda for scalable, production-ready use.

★ 1 · ⑂ 0 · Jupyter Notebook · Pushed 17d ago · Listed 7d ago · No license on GitHub

aws · data-analysis · data-science · docker · eda · python

Jupyter Notebook 98.7%
Python 0.8%
HTML 0.3%
CSS 0.2%
Dockerfile 0.0%

1 Review

thejaycampbell · 7d ago

RiskFlow is a useful, nicely scoped ML deployment project because it does more than stop at a notebook: it connects exploratory credit-risk modeling to a FastAPI prediction service, a small web UI, Docker packaging, and AWS Lambda deployment notes. That end-to-end shape is the strongest part of the repository. The README makes the intended flow clear for a learner or reviewer: inspect Credit_Risk_Modelling.ipynb, understand the RandomForest-based loan default model, run the API, try JSON prediction inputs, then package or deploy it. The screenshots and sample request payloads also help make the project feel more concrete than many notebook-only data science repos.

The main thing holding the project back is polish around reproducibility. The README currently points to an older clone path/name in a few places, and some documented filenames do not line up cleanly with the visible structure, such as references to app.py, main.py, RandomForest_Best.sav, and RFC_pipeline.sav. For a deployment-focused project, those mismatches matter because a new user’s first test is usually “can I run this exactly as written?” I would tighten the README around one canonical run path: clone this repo, create the environment, run the notebook or use the saved artifact, start uvicorn, then call /predict with one known-good JSON file.
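The last step of that run path, calling /predict with one known-good JSON payload, could be scripted roughly as follows. This is a minimal sketch using only the Python standard library. The URL assumes a local uvicorn server on its default port, and the helper names build_predict_request and call_predict are hypothetical, not taken from the repository.

```python
import json
import urllib.request

# Assumed address of a locally running uvicorn server; adjust host/port as needed.
PREDICT_URL = "http://127.0.0.1:8000/predict"


def build_predict_request(payload: dict, url: str = PREDICT_URL) -> urllib.request.Request:
    """Build a JSON POST request for the prediction endpoint without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_predict(payload: dict, url: str = PREDICT_URL, timeout: float = 10.0) -> dict:
    """Send the payload to the running API and return the decoded JSON response."""
    request = build_predict_request(payload, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, a README could then document a single command that loads one sample file with json.load and passes the result to call_predict, so a new user has exactly one request to reproduce.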

The API design is a solid start: using Pydantic fields for the loan attributes gives the service a readable contract, and exposing both JSON prediction and a Web UI makes the model easier to demo. The next improvement would be automated tests around the prediction endpoint and model-loading path. Even a small pytest suite with FastAPI’s TestClient, one valid sample, and one invalid sample would make the Docker and AWS claims much more convincing. I’d also add a short model-card style section describing the dataset source, target label, train/test split, evaluation scores, and limitations, since credit-risk prediction is a domain where users need to understand bias, calibration, and intended use.

Community and maintenance signals are early-stage: the repo has 1 star, 0 forks, no open issues or PRs, 14 commits, and one release from November 14, 2024. That is fine for a portfolio or learning project, but adding a license, a contribution guide, and a simple GitHub Actions workflow would make it feel more adoptable as open source. Overall, this is a promising practical ML project with a good end-to-end concept; the biggest gains would come from making the documented commands line up perfectly with the files, adding endpoint tests, and explaining the model’s evaluation and deployment assumptions more explicitly.