Salary-Prediction-Using-Ensemble-Learning

Salary Predictor

Overview

Welcome to the Salary Predictor project! This application provides a data-driven estimation of an individual's potential salary based on key professional attributes like age, job role, education level, and years of experience. Leveraging an ensemble of powerful machine learning models, this tool aims to offer valuable insights into market salary trends.

The project is structured into three main components:

Machine Learning Model: A robust pipeline that preprocesses data and uses an ensemble (Random Forest, LightGBM, XGBoost) to predict salaries.
Flask API: A backend service that exposes the trained model via a RESTful API, allowing external applications to request salary predictions.
Web Interface: A user-friendly frontend built with HTML and Tailwind CSS, enabling users to input their details and receive instant salary predictions.

Features

Accurate Salary Prediction: Utilizes a highly effective ensemble model for precise salary estimations.
Data Preprocessing Pipeline: Handles missing values, scales numerical features, and encodes categorical features automatically.
RESTful API: Provides a clean and easy-to-use API endpoint for programmatically accessing predictions.
Intuitive Web Interface: A modern and responsive web application for a seamless user experience.
Parallax Background Effects: Enhances the visual appeal of the web interface with dynamic background elements.
Model Persistence: The trained machine learning model is saved for easy deployment and reuse.

Project Structure

You're building a comprehensive Salary Prediction project that includes a machine learning model, a Flask API, and a web-based user interface. This is a great setup!

Here's a detailed README.md description for your project, covering all the essential aspects:

Markdown

Salary Predictor

Overview

Welcome to the Salary Predictor project! This application provides a data-driven estimation of an individual's potential salary based on key professional attributes like age, job role, education level, and years of experience. Leveraging an ensemble of powerful machine learning models, this tool aims to offer valuable insights into market salary trends.

The project is structured into three main components:

Machine Learning Model: A robust pipeline that preprocesses data and uses an ensemble (Random Forest, LightGBM, XGBoost) to predict salaries.
Flask API: A backend service that exposes the trained model via a RESTful API, allowing external applications to request salary predictions.
Web Interface: A user-friendly frontend built with HTML and Tailwind CSS, enabling users to input their details and receive instant salary predictions.

Features

Accurate Salary Prediction: Utilizes a highly effective ensemble model for precise salary estimations.
Data Preprocessing Pipeline: Handles missing values, scales numerical features, and encodes categorical features automatically.
RESTful API: Provides a clean and easy-to-use API endpoint for programmatically accessing predictions.
Intuitive Web Interface: A modern and responsive web application for a seamless user experience.
Parallax Background Effects: Enhances the visual appeal of the web interface with dynamic background elements.
Model Persistence: The trained machine learning model is saved for easy deployment and reuse.

Setup and Installation

Follow these steps to set up and run the project locally.

1. Clone the Repository

First, clone the project repository to your local machine.

2. Create a Virtual Environment (Recommended)

It's highly recommended to use a virtual environment to manage dependencies:

3. Install Dependencies

Install all the required Python libraries:

4. Prepare the Dataset

Ensure you have your salary data ready. The script expects a CSV file named Salary Data.csv in the root directory of the project. Make sure it contains columns like 'Salary', 'Years of Experience', 'Education Level', 'Age', and 'Job Title'. The app.py script will handle the renaming of columns to 'avg_salary', 'experience_years', and 'education_level' as part of its preprocessing.

5. Run the Application

The app.py script performs both the model training/saving and starts the Flask API.

Upon successful execution, you will see output indicating the model training progress, evaluation metrics, and that the Flask application is running.

Usage

Web Interface

Once the Flask app is running, open your web browser and navigate to:

Home Page: http://localhost:5000/ (This route serves a JSON message; you might need to manually navigate to http://localhost:5000/homePage.html for the actual home page.)

Prediction Form: http://localhost:5000/index.html

Input the required details (Age, Job Title, Education Level, Years of Experience) into the form and click the "Predict Salary" button to get an estimated salary.

API Endpoint

You can also interact with the prediction model directly via the API. Send a POST request to http://localhost:5000/predict_salary with a JSON body.

Endpoint: POST /predict_salary

Request Body Example:

Response Body Example (Success):

Response Body Example (Error - Missing Fields):

Response Body Example (Error - Invalid Input):

Model Details

The machine learning model is an Ensemble Regressor combining:

RandomForestRegressor

LGBMRegressor (Light Gradient Boosting Machine)

XGBRegressor (Extreme Gradient Boosting)

This ensemble approach typically provides more robust and accurate predictions than a single model.

Preprocessing Steps:

Column Renaming: 'Salary' -> 'avg_salary', 'Years of Experience' -> 'experience_years', 'Education Level' -> 'education_level'.

Missing Value Imputation:

'education_level': Filled with 'Unknown'.

'experience_years': Filled with the median of the 'experience_years' column.

Numeric Conversion: 'experience_years' and 'avg_salary' are converted to numeric types, with coercion for errors.

Categorical Encoding: 'job_simp' and 'education_level' are one-hot encoded.

Numerical Scaling: 'age' and 'experience_years' are standardized using StandardScaler.

The entire preprocessing and modeling steps are encapsulated within a Pipeline for consistent transformations during training and prediction.

Technologies Used

Python 3.x

pandas

numpy

scikit-learn

LightGBM

XGBoost

Flask

Flask-CORS

joblib (for model persistence)

HTML5

Tailwind CSS

JavaScript

Future Enhancements

Implement more sophisticated hyperparameter tuning for the ensemble models.

Add more features to the dataset (e.g., location, company size, industry).

Create a user authentication system for personalized predictions.

Improve error handling and validation on both frontend and backend.

Deploy the application to a cloud platform (e.g., AWS, Heroku).

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
GUI		GUI
IBM_Salary_Prediction_Project.mp4		IBM_Salary_Prediction_Project.mp4
README.md		README.md
Salary Data.csv		Salary Data.csv
app.py		app.py
homePage.html		homePage.html
index.html		index.html
prediction.html		prediction.html
salary-prediction-model.py		salary-prediction-model.py
salary_model.pkl		salary_model.pkl
salary_prediction_model.pkl		salary_prediction_model.pkl

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Salary-Prediction-Using-Ensemble-Learning

Salary Predictor

Overview

Features

Project Structure

Salary Predictor

Overview

Features

Setup and Installation

1. Clone the Repository

2. Create a Virtual Environment (Recommended)

3. Install Dependencies

4. Prepare the Dataset

5. Run the Application

Usage

Web Interface

API Endpoint

Model Details

Preprocessing Steps:

Technologies Used

Future Enhancements

About

Uh oh!

Releases

Packages

Languages

jiyasingh5725/Salary-Prediction-Using-Ensemble-Learning

Folders and files

Latest commit

History

Repository files navigation

Salary-Prediction-Using-Ensemble-Learning

Salary Predictor

Overview

Features

Project Structure

Salary Predictor

Overview

Features

Setup and Installation

1. Clone the Repository

2. Create a Virtual Environment (Recommended)

3. Install Dependencies

4. Prepare the Dataset

5. Run the Application

Usage

Web Interface

API Endpoint

Model Details

Preprocessing Steps:

Technologies Used

Future Enhancements

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages