Smart Crop Disease Detection using PyTorch Lightning, DVC, Jenkins, and Docker Compose

I built this project to automatically detect diseases in crops using machine learning. Here’s the journey of how I developed it:

Steps I Took

1. Set up the environment

  • I created a virtual environment and installed all necessary dependencies such as PyTorch, PyTorch Lightning, and other required libraries by adding them to `requirements.txt`.
  • I made sure everything was containerized using Docker. I wrote separate Dockerfiles for different purposes:
    • `Dockerfile.training` for training the model.
    • `Dockerfile.inference` for deploying the model for inference.

2. Managed datasets

I used DVC (Data Version Control) to handle the dataset versions. This allowed me to track changes in the datasets and ensure the reproducibility of the model training process.

The `.dvc` directory was configured to track large files and datasets that were stored separately from the codebase, with an S3 bucket configured as the DVC remote.
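As a minimal sketch of how DVC-tracked data can also be accessed from Python code (the file path below is a placeholder; the real paths and remote settings live in this repo's `.dvc` configuration):

```python
# Minimal sketch: reading a DVC-tracked file through DVC's Python API.
# The path "data/plantvillage/metadata.csv" is illustrative, not the
# actual dataset layout of this repository.
import dvc.api

with dvc.api.open("data/plantvillage/metadata.csv", repo=".") as f:
    # Peek at the first few characters to confirm the file resolves
    # through the configured remote (an S3 bucket in this project).
    print(f.read()[:200])
```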

3. Developed the Model

I built the deep learning model using PyTorch Lightning for more efficient training and management of experiments. I integrated all data preprocessing, model architecture, and training logic into the `src/` folder, ensuring that the code was modular and easy to manage.
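As a rough illustration of that structure (not the exact architecture in `src/`; the class name, layer sizes, and hyperparameters below are placeholders), a LightningModule for this task might look like this:

```python
# Sketch of a LightningModule for crop disease classification.
# The backbone is a small placeholder CNN; the real model lives in src/.
import torch
import torch.nn as nn
import torch.nn.functional as F
import pytorch_lightning as pl


class CropDiseaseClassifier(pl.LightningModule):
    def __init__(self, num_classes: int = 2, lr: float = 1e-3):
        super().__init__()
        self.save_hyperparameters()
        # Placeholder backbone standing in for the actual architecture.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.head(self.backbone(x).flatten(1))

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=self.hparams.lr)
```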

4. Testing

I created unit tests to validate the functionality of the code and ensure the correctness of the data pipeline and model. All test files are in the `tests/` directory.
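A test in this style might look like the following sketch, assuming the `CropDiseaseClassifier` shown above and a hypothetical `src.model` import path:

```python
# Sketch of a pytest-style unit test in the spirit of tests/.
import torch
from src.model import CropDiseaseClassifier  # hypothetical module path


def test_forward_output_shape():
    model = CropDiseaseClassifier(num_classes=2)
    dummy_batch = torch.randn(4, 3, 224, 224)  # 4 RGB images, 224x224
    logits = model(dummy_batch)
    assert logits.shape == (4, 2)  # one logit per class for each image
```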

5. Implemented CI/CD

I set up a Jenkins pipeline to automate the testing and deployment process. The pipeline is defined in the `Jenkinsfile`, which outlines the steps for continuous integration and deployment.

6. Training

I used the `Dockerfile.training` configuration to train the model. This isolated the training environment and ensured reproducibility, so I could easily scale the training on different machines or cloud platforms.
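The training entrypoint executed inside that container could look roughly like this sketch; the DataModule name, module paths, and checkpoint directory are assumptions, not the repo's exact layout:

```python
# Sketch of a training entrypoint run inside the Dockerfile.training container.
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

from src.model import CropDiseaseClassifier   # hypothetical module path
from src.data import CropImageDataModule      # hypothetical LightningDataModule

if __name__ == "__main__":
    model = CropDiseaseClassifier(num_classes=2)
    datamodule = CropImageDataModule(batch_size=32)
    trainer = pl.Trainer(
        max_epochs=20,
        accelerator="auto",  # use GPU if available, otherwise CPU
        callbacks=[ModelCheckpoint(dirpath="checkpoints/")],
    )
    trainer.fit(model, datamodule=datamodule)
```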

7. Inference Deployment

Once the model was trained, I deployed it for inference using the `Dockerfile.inference`. This allowed the trained model to serve predictions via an API.
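A sketch of how the trained checkpoint could be loaded and queried for a single image; the checkpoint path, input size, and label order are assumptions:

```python
# Sketch of an inference helper: load the checkpoint, preprocess an image,
# and return the predicted label.
import torch
from PIL import Image
from torchvision import transforms

from src.model import CropDiseaseClassifier  # hypothetical module path

LABELS = ["healthy", "diseased"]              # illustrative label order
PREPROCESS = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])


def predict(image_path: str, ckpt_path: str = "checkpoints/best.ckpt") -> str:
    model = CropDiseaseClassifier.load_from_checkpoint(ckpt_path)
    model.eval()
    image = PREPROCESS(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        logits = model(image)
    return LABELS[int(logits.argmax(dim=1))]
```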

8. Web App

I built a Flask application to provide a simple interface for users to interact with the model. The web app accepts crop images as input and uses the trained model to predict whether the crop is healthy or diseased.

The Flask app is located in the `app/` directory and uses HTML templates to render the UI.
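A minimal sketch of what the prediction route could look like, assuming the `predict()` helper from the previous sketch, an `index.html` template, and an upload form field named `image` (all illustrative):

```python
# Sketch of the Flask app's prediction route.
from flask import Flask, render_template, request

from app.inference import predict  # hypothetical module for the predict() helper above

app = Flask(__name__)


@app.route("/", methods=["GET", "POST"])
def index():
    prediction = None
    if request.method == "POST":
        uploaded = request.files["image"]        # hypothetical form field name
        uploaded.save("/tmp/upload.jpg")         # temporary location for the upload
        prediction = predict("/tmp/upload.jpg")  # healthy vs. diseased
    return render_template("index.html", prediction=prediction)


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```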

9. Container Orchestration

To manage and deploy the different containers (training, inference, and app), I used Docker Compose. The orchestration logic is defined in `docker-compose.yml` to spin up all services together.