	---
	license: mit
	datasets:
	- custom
	metrics:
	- mean_squared_error
	- mean_absolute_error
	- r2_score
	model_name: Fertilizer Recommendation System
	tags:
	- random-forest
	- regression
	- multioutput
	- classification
	- agriculture
	- soil-nutrients
	---

	# Fertilizer Application Recommendation System

	## Overview

	This model predicts the fertilizer requirements of a crop from its crop type, target yield, field size, and soil properties. It combines a Random Forest regressor, which predicts numerical values (e.g., nutrient needs), with a Random Forest classifier, which predicts categorical values (e.g., fertilizer application instructions).

	## Training Data

	The model was trained on a custom dataset containing the following features:

	- Crop Name
	- Target Yield
	- Field Size
	- pH (water)
	- Organic Carbon
	- Total Nitrogen
	- Phosphorus (M3)
	- Potassium (exch.)
	- Soil moisture

	The target variables include:

	Numerical Targets:
	- Nitrogen (N) Need
	- Phosphorus (P2O5) Need
	- Potassium (K2O) Need
	- Organic Matter Need
	- Lime Need
	- Lime Application - Requirement
	- Organic Matter Application - Requirement
	- 1st Application - Requirement (1)
	- 1st Application - Requirement (2)
	- 2nd Application - Requirement (1)

	Categorical Targets:
	- Lime Application - Instruction
	- Lime Application
	- Organic Matter Application - Instruction
	- Organic Matter Application
	- 1st Application
	- 1st Application - Type fertilizer (1)
	- 1st Application - Type fertilizer (2)
	- 2nd Application
	- 2nd Application - Type fertilizer (1)

	## Model Training

	The model was trained using the following steps:

	1. Data Preprocessing:
	   - Handling missing values
	   - Scaling numerical features using `StandardScaler`
	   - One-hot encoding categorical features

	2. Modeling:
	   - Splitting the dataset into training and testing sets
	   - Training a `RandomForestRegressor` for numerical targets using a `MultiOutputRegressor`
	   - Training a `RandomForestClassifier` for categorical targets using a `MultiOutputClassifier`

	3. Evaluation:
	   - Evaluating the models on the test set with Mean Squared Error (MSE), Mean Absolute Error (MAE), and R-squared (R2) Score for regression, and accuracy for classification
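
The steps above can be sketched roughly as follows. This is a minimal illustration and not the original training script: the data is synthetic, only a subset of the features and targets is used, and median imputation, the 80/20 split, and `n_estimators=50` are assumptions that the model card does not specify.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.metrics import accuracy_score, mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputClassifier, MultiOutputRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Synthetic stand-in for the custom dataset (all values are made up).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "Crop Name": rng.choice(["maize(corn)", "beans", "wheat"], size=n),
    "Target Yield": rng.uniform(1000, 6000, size=n),
    "Field Size": rng.uniform(0.5, 5.0, size=n),
    "pH (water)": rng.uniform(4.5, 8.0, size=n),
})
y_num = pd.DataFrame({
    "Nitrogen (N) Need": 0.02 * X["Target Yield"] + rng.normal(0, 5, size=n),
    "Lime Need": np.clip(6.5 - X["pH (water)"], 0, None) * 100,
})
y_cat = pd.DataFrame({
    "Lime Application": np.where(X["pH (water)"] < 5.5, "yes", "no"),
    "1st Application": np.where(X["Crop Name"] == "beans", "at planting", "before planting"),
})

# Encode each categorical target with its own LabelEncoder.
label_encoders = {col: LabelEncoder().fit(y_cat[col]) for col in y_cat.columns}
y_cat_encoded = pd.DataFrame(
    {col: le.transform(y_cat[col]) for col, le in label_encoders.items()}
)

# 1. Preprocessing: impute (assumed: median), scale numeric, one-hot encode categorical.
numeric_features = ["Target Yield", "Field Size", "pH (water)"]
categorical_features = ["Crop Name"]
preprocessor = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric_features),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_features),
])

# 2. Modeling: split, then fit one multi-output model per target type.
X_train, X_test, yn_train, yn_test, yc_train, yc_test = train_test_split(
    X, y_num, y_cat_encoded, test_size=0.2, random_state=0
)
X_train_t = preprocessor.fit_transform(X_train)
X_test_t = preprocessor.transform(X_test)

numerical_model = MultiOutputRegressor(
    RandomForestRegressor(n_estimators=50, random_state=0)
)
numerical_model.fit(X_train_t, yn_train)

categorical_model = MultiOutputClassifier(
    RandomForestClassifier(n_estimators=50, random_state=0)
)
categorical_model.fit(X_train_t, yc_train)

# 3. Evaluation on the held-out test set.
yn_pred = numerical_model.predict(X_test_t)
yc_pred = categorical_model.predict(X_test_t)
mse = mean_squared_error(yn_test, yn_pred)
mae = mean_absolute_error(yn_test, yn_pred)
r2 = r2_score(yn_test, yn_pred)
accuracies = {
    col: accuracy_score(yc_test[col], yc_pred[:, i])
    for i, col in enumerate(y_cat.columns)
}
print(f"MSE={mse:.2f} MAE={mae:.2f} R2={r2:.3f} accuracy={accuracies}")
```

Fitting the preprocessor on the training split only, and reusing it unchanged at inference time, keeps test-set statistics out of the scaler.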

	## Evaluation Metrics

	The model was evaluated using the following metrics:

	- Mean Squared Error (MSE)
	- Mean Absolute Error (MAE)
	- R-squared (R2) Score
	- Accuracy for categorical targets
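
For reference, the metrics listed above can be computed for a single target as in the sketch below. It uses plain Python and made-up values; the function names are illustrative, and the card does not say how scores were aggregated across targets.

```python
def mse(y_true, y_pred):
    """Mean of squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean of absolute differences between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """1 minus the ratio of residual to total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values for one numerical target and one categorical target.
y_true_num = [3.0, 5.0, 7.0]
y_pred_num = [2.0, 5.0, 9.0]
y_true_cat = ["a", "b", "a", "c"]
y_pred_cat = ["a", "b", "c", "c"]

print(mse(y_true_num, y_pred_num))       # 5/3, about 1.667
print(mae(y_true_num, y_pred_num))       # 1.0
print(r2(y_true_num, y_pred_num))        # 0.375
print(accuracy(y_true_cat, y_pred_cat))  # 0.75
```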

	## How to Use

	### Input Format

	The model expects one record in JSON format (passed to `make_predictions` as a Python dictionary) with the following fields:

	- "Crop Name": String
	- "Target Yield": Numeric
	- "Field Size": Numeric
	- "pH (water)": Numeric
	- "Organic Carbon": Numeric
	- "Total Nitrogen": Numeric
	- "Phosphorus (M3)": Numeric
	- "Potassium (exch.)": Numeric
	- "Soil moisture": Numeric
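
A JSON payload can be checked against these fields before it is passed to the model. The helper below is a hypothetical sketch and not part of the published repository: `EXPECTED_FIELDS` and `parse_input` are illustrative names, and the sample values are taken from the example record used in this card.

```python
import json

# Field names and types as listed above; the numeric check accepts int or float.
EXPECTED_FIELDS = {
    "Crop Name": str,
    "Target Yield": (int, float),
    "Field Size": (int, float),
    "pH (water)": (int, float),
    "Organic Carbon": (int, float),
    "Total Nitrogen": (int, float),
    "Phosphorus (M3)": (int, float),
    "Potassium (exch.)": (int, float),
    "Soil moisture": (int, float),
}


def parse_input(payload: str) -> dict:
    """Parse a JSON string and verify that every expected field is present and typed correctly."""
    record = json.loads(payload)
    missing = [name for name in EXPECTED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"Missing fields: {missing}")
    for name, expected_type in EXPECTED_FIELDS.items():
        value = record[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise TypeError(f"Field {name!r} has unexpected type {type(value).__name__}")
    # Return only the expected fields, in a fixed order.
    return {name: record[name] for name in EXPECTED_FIELDS}


example_payload = json.dumps({
    "Crop Name": "maize(corn)",
    "Target Yield": 3600.0,
    "Field Size": 1.0,
    "pH (water)": 6.1,
    "Organic Carbon": 11.4,
    "Total Nitrogen": 1.1,
    "Phosphorus (M3)": 1.8,
    "Potassium (exch.)": 3.0,
    "Soil moisture": 20.0,
})

record = parse_input(example_payload)
print(record["Crop Name"], record["Target Yield"])
```

The returned dictionary has the same shape as the `input_data` dictionary expected by the inference script.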

	### Preprocessing Steps

	The inference script in the next section performs the following steps:

	- Loads the models and the preprocessor.
	- Defines the categorical and numerical targets.
	- Loads the label encoders.
	- Defines a function, `make_predictions`, that preprocesses the input data, makes predictions, and decodes the categorical predictions.

	### Inference Procedure

	```python
	import pandas as pd
	from joblib import load
	from huggingface_hub import hf_hub_download
	from sklearn.preprocessing import LabelEncoder

	# Load models and preprocessor
	preprocessor_path = hf_hub_download(repo_id='Briankabiru/FertiliserApplication', filename='preprocessor.joblib')
	numerical_model_path = hf_hub_download(repo_id='Briankabiru/FertiliserApplication', filename='numerical_model.joblib')
	categorical_model_path = hf_hub_download(repo_id='Briankabiru/FertiliserApplication', filename='categorical_model.joblib')

	preprocessor = load(preprocessor_path)
	numerical_model = load(numerical_model_path)
	categorical_model = load(categorical_model_path)

	# Define categorical targets
	categorical_targets = [
	    'Lime Application - Instruction',
	    'Lime Application',
	    'Organic Matter Application - Instruction',
	    'Organic Matter Application',
	    '1st Application',
	    '1st Application - Type fertilizer (1)',
	    '1st Application - Type fertilizer (2)',
	    '2nd Application',
	    '2nd Application - Type fertilizer (1)',
	    '1st Application_1',
	    '1st Application - Type fertilizer (1)_3',
	    '1st Application - Type fertilizer (2)_5',
	    '2nd Application_6',
	    '1st Application_21',
	    '1st Application - Type fertilizer (1)_23',
	    '1st Application - Type fertilizer (2)_25',
	    '2nd Application_26',
	    '2nd Application - Type fertilizer (1)_28',
	]

	# Define numerical targets
	numerical_targets = [
	    'Nitrogen (N) Need',
	    'Phosphorus (P2O5) Need',
	    'Potassium (K2O) Need',
	    'Organic Matter Need',
	    'Lime Need',
	    'Lime Application - Requirement',
	    'Organic Matter Application - Requirement',
	    '1st Application - Requirement (1)',
	    '1st Application - Requirement (2)',
	    '2nd Application - Requirement (1)',
	]

	# Load one label encoder per categorical target
	label_encoders = {
	    col: load(hf_hub_download(
	        repo_id='Briankabiru/FertiliserApplication',
	        filename=f'label_encoder_{col}.joblib',
	    ))
	    for col in categorical_targets
	}

	def make_predictions(input_data):
	    """Return a dict of decoded predictions for one input record."""
	    # Convert input data to a single-row DataFrame
	    input_df = pd.DataFrame([input_data])

	    # Preprocess the input data
	    X_transformed = preprocessor.transform(input_df)

	    # Predict with numerical model
	    numerical_predictions = numerical_model.predict(X_transformed)

	    # Predict with categorical model
	    categorical_predictions_encoded = categorical_model.predict(X_transformed)

	    # Decode categorical predictions back to their original labels
	    categorical_predictions_decoded = {}
	    for i, col in enumerate(categorical_targets):
	        le = label_encoders[col]
	        try:
	            categorical_predictions_decoded[col] = le.inverse_transform(categorical_predictions_encoded[:, i])
	        except ValueError:
	            # The encoder does not know the predicted label; fall back to a placeholder
	            categorical_predictions_decoded[col] = ["Unknown"] * len(categorical_predictions_encoded)

	    # Combine numerical and categorical predictions into a dictionary
	    predictions_combined = {col: numerical_predictions[0, i] for i, col in enumerate(numerical_targets)}
	    predictions_combined.update({col: categorical_predictions_decoded[col][0] for col in categorical_targets})

	    return predictions_combined

	# Example usage
	input_data = {
	    'Crop Name': 'maize(corn)',
	    'Target Yield': 3600.0,
	    'Field Size': 1.0,
	    'pH (water)': 6.1,
	    'Organic Carbon': 11.4,
	    'Total Nitrogen': 1.1,
	    'Phosphorus (M3)': 1.8,
	    'Potassium (exch.)': 3.0,
	    'Soil moisture': 20.0,
	}

	predictions = make_predictions(input_data)

	print("Predicted Fertilizer Requirements:")
	for col, pred_value in predictions.items():
	    print(f"{col}: {pred_value}")
	```