Implementing Distributed Task Scheduling in Spring Boot Using ShedLock and MongoDB

In distributed systems, scheduling tasks across multiple instances can be challenging. Without proper synchronization, every instance fires the same scheduled task, so the task runs several times and can cause duplicate work or inconsistent data. This is where ShedLock comes in: a lightweight library that prevents concurrent execution of the same scheduled task in distributed applications.

This article explores how to integrate ShedLock with MongoDB so that each scheduled task is executed at most once at a time, even when multiple instances of the application are running.

1. What is ShedLock?

ShedLock is a library that ensures that only one instance of a scheduled task is executed in a distributed environment, even when multiple instances of an application are running. ShedLock achieves this by storing lock information in a database, making it accessible to all instances of your application.

Why use ShedLock?

It prevents the same scheduled task from being executed multiple times across different application instances.
It supports various backends like MongoDB, MySQL, Redis, and more.
It provides a simple API for integrating with Spring’s @Scheduled annotation.

2. Setting Up MongoDB for ShedLock

MongoDB is commonly used with ShedLock as a lock provider. Before you can integrate ShedLock with MongoDB, ensure that MongoDB is up and running, and you have the connection details handy.

MongoDB Dependencies

To get started, add the following dependencies in your pom.xml if you are using Maven:

<dependency>
    <groupId>net.javacrumbs.shedlock</groupId>
    <artifactId>shedlock-spring</artifactId>
    <version>5.16.0</version>
</dependency>
<dependency>
    <groupId>net.javacrumbs.shedlock</groupId>
    <artifactId>shedlock-provider-mongo</artifactId>
    <version>5.16.0</version>
</dependency>
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-mongodb</artifactId>
</dependency>

3. Configuration for ShedLock with MongoDB

You need to configure MongoDB as a lock provider for ShedLock. Here’s how you can set it up in a Spring Boot application.

MongoDB Configuration:

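import com.mongodb.client.MongoClient;
import net.javacrumbs.shedlock.core.LockProvider;
import net.javacrumbs.shedlock.provider.mongo.MongoLockProvider;
import net.javacrumbs.shedlock.spring.annotation.EnableSchedulerLock;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
// Activates @SchedulerLock processing. "PT1M" is an example default,
// used only when a task does not set its own lockAtMostFor.
@EnableSchedulerLock(defaultLockAtMostFor = "PT1M")
public class ShedLockConfig {

    // Lock documents are stored in the given database.
    @Bean
    public LockProvider lockProvider(MongoClient mongoClient) {
        return new MongoLockProvider(mongoClient.getDatabase("mydatabase"));
    }
}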

Here, MongoLockProvider stores the lock information in MongoDB. Replace "mydatabase" with the name of your MongoDB database; ShedLock keeps its lock documents there.

Scheduler Configuration:

Now, let’s define a scheduled task that will use ShedLock to ensure it runs only once across all instances. You can annotate your task with @SchedulerLock to handle locking logic.

import net.javacrumbs.shedlock.spring.annotation.SchedulerLock;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Service;

@Service
public class MyScheduledTask {

    @Scheduled(cron = "0 */1 * * * ?")  // Runs every minute
    @SchedulerLock(name = "scheduledTask", lockAtLeastFor = "PT30S", lockAtMostFor = "PT1M")
    public void executeTask() {
        // Your task logic here
        System.out.println("Scheduled Task executed!");
    }
}

In this example:

@Scheduled specifies the cron expression, which runs the task every minute.
@SchedulerLock ensures that the lock is acquired before running the task. It includes parameters:
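- lockAtLeastFor: Keeps the lock for at least 30 seconds even if the task finishes sooner, so another instance whose clock or trigger fires slightly later cannot run the same task again.
- lockAtMostFor: Sets the maximum time the lock can be held (1 minute here), so the lock expires on its own if a node dies before releasing it. Set it comfortably longer than the task's normal run time.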
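
To make the interplay of the two durations concrete, here is a minimal sketch in plain Java of when the lock becomes available to other instances again. It is a simplified model under the semantics described above, not ShedLock's implementation; the class LockWindowSketch and the method availableAgainAt are hypothetical names introduced for illustration, and the durations mirror the example.

```java
import java.time.Duration;
import java.time.Instant;

// Simplified model of the lock window; NOT ShedLock's actual code.
class LockWindowSketch {

    // Returns the time at which another instance could acquire the lock again.
    static Instant availableAgainAt(Instant lockedAt, Duration taskDuration,
                                    Duration lockAtLeastFor, Duration lockAtMostFor) {
        Instant taskEnd = lockedAt.plus(taskDuration);
        Instant earliest = lockedAt.plus(lockAtLeastFor);
        Instant latest = lockedAt.plus(lockAtMostFor);
        if (taskEnd.isAfter(latest)) {
            // Task overran (or the node died): the lock expires on its own.
            return latest;
        }
        // Normal completion: released at task end, but never before lockAtLeastFor.
        return taskEnd.isBefore(earliest) ? earliest : taskEnd;
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2024-01-01T00:00:00Z");
        Duration atLeast = Duration.parse("PT30S");
        Duration atMost = Duration.parse("PT1M");
        // Short task: lock is kept until the 30-second mark.
        System.out.println(availableAgainAt(start, Duration.ofSeconds(5), atLeast, atMost));
        // Medium task: lock is released when the task ends, at 45 seconds.
        System.out.println(availableAgainAt(start, Duration.ofSeconds(45), atLeast, atMost));
        // Overrunning task: lock expires at the 1-minute mark.
        System.out.println(availableAgainAt(start, Duration.ofSeconds(90), atLeast, atMost));
    }
}
```

With these values, a 5-second task frees the lock after 30 seconds, a 45-second task after 45 seconds, and a task that runs for 90 seconds loses the lock at the 1-minute mark, at which point another instance may start the task while the first is still running.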

Enable Scheduling:

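Ensure that scheduling is enabled in your Spring Boot application. You can do this by adding @EnableScheduling to your main application class, as shown here. Separately, @SchedulerLock only takes effect when ShedLock's @EnableSchedulerLock annotation is present on a configuration class.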

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.scheduling.annotation.EnableScheduling;

@SpringBootApplication
@EnableScheduling
public class MyApplication {

    public static void main(String[] args) {
        SpringApplication.run(MyApplication.class, args);
    }
}

4. How ShedLock Works with MongoDB

Once you’ve set everything up:

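Before a scheduled task runs, ShedLock attempts to acquire a lock in MongoDB.
If no lock document exists for the lock name, it creates one in the shedLock collection (the provider's default collection name), recording which instance holds the lock and until when. If a document exists but has already expired, it is updated in the same way.
If the lock exists and has not expired (i.e., another instance currently holds it), the current instance skips this execution.
After the task finishes, the lock is released by setting its expiry to the current time, or to the end of the lockAtLeastFor period if that is later. The document itself stays in the collection and is reused on the next run.

You can query the shedLock collection in MongoDB to inspect the current lock state:

db.shedLock.find()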
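
The acquire and release steps described above can be modeled with a small in-memory sketch in plain Java. This is an illustration under simplifying assumptions, not ShedLock's code: a ConcurrentHashMap stands in for the MongoDB collection, and InMemoryLockSketch, LockDoc, tryLock and unlock are hypothetical names. The field names lockUntil, lockedAt and lockedBy are chosen to resemble a lock document.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for the lock collection; a simplified model only.
class InMemoryLockSketch {

    // One entry per lock name, resembling the fields of a lock document.
    static final class LockDoc {
        final Instant lockUntil;
        final Instant lockedAt;
        final String lockedBy;

        LockDoc(Instant lockUntil, Instant lockedAt, String lockedBy) {
            this.lockUntil = lockUntil;
            this.lockedAt = lockedAt;
            this.lockedBy = lockedBy;
        }
    }

    private final ConcurrentHashMap<String, LockDoc> locks = new ConcurrentHashMap<>();

    // Returns true if 'instance' obtained the lock at time 'now'.
    boolean tryLock(String name, String instance, Instant now, Duration lockAtMostFor) {
        LockDoc candidate = new LockDoc(now.plus(lockAtMostFor), now, instance);
        // compute() is atomic per key, playing the role of an atomic database write.
        LockDoc stored = locks.compute(name, (key, current) ->
                (current == null || !current.lockUntil.isAfter(now)) ? candidate : current);
        return stored == candidate;
    }

    // Releasing keeps the entry and only moves lockUntil forward or back.
    void unlock(String name, String instance, Instant now, Duration lockAtLeastFor) {
        locks.computeIfPresent(name, (key, current) -> {
            if (!current.lockedBy.equals(instance)) {
                return current; // only the holder may release
            }
            Instant earliest = current.lockedAt.plus(lockAtLeastFor);
            Instant until = now.isBefore(earliest) ? earliest : now;
            return new LockDoc(until, current.lockedAt, instance);
        });
    }

    public static void main(String[] args) {
        InMemoryLockSketch store = new InMemoryLockSketch();
        Instant t0 = Instant.parse("2024-01-01T00:00:00Z");
        Duration atMost = Duration.ofMinutes(1);
        Duration atLeast = Duration.ofSeconds(30);

        System.out.println(store.tryLock("scheduledTask", "instance-A", t0, atMost));                 // true
        System.out.println(store.tryLock("scheduledTask", "instance-B", t0.plusSeconds(1), atMost));  // false
        store.unlock("scheduledTask", "instance-A", t0.plusSeconds(5), atLeast);
        System.out.println(store.tryLock("scheduledTask", "instance-B", t0.plusSeconds(10), atMost)); // false
        System.out.println(store.tryLock("scheduledTask", "instance-B", t0.plusSeconds(30), atMost)); // true
    }
}
```

Instance B is refused at 10 seconds even though instance A finished at 5 seconds, because the release step keeps the lock until the lockAtLeastFor period ends.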

5. Best Practices and Considerations

Handle Clock Skew: In a distributed system, ensure that your system clocks are synchronized to avoid issues with lock expiration.
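Database Cleanup: ShedLock keeps one document per lock name and reuses it on every run, so the collection does not grow over time. Cleanup is only worth considering for documents left behind by tasks you have renamed or removed.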
Optimizing Lock Duration: Properly define lockAtMostFor and lockAtLeastFor to strike a balance between preventing task re-execution and ensuring locks are released in case of task failures.
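
To illustrate why clock skew matters, the following plain-Java sketch assumes each instance compares the stored lock expiry with its own local clock. The class ClockSkewSketch and the method looksExpired are hypothetical names for illustration, and the 45-second offset is an exaggerated example value.

```java
import java.time.Duration;
import java.time.Instant;

// Illustrates how a skewed local clock can make a held lock look expired.
class ClockSkewSketch {

    // An instance treats the lock as free once its own clock reaches lockUntil.
    static boolean looksExpired(Instant lockUntil, Instant trueNow, Duration clockOffset) {
        Instant localNow = trueNow.plus(clockOffset);
        return !lockUntil.isAfter(localNow);
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2024-01-01T00:00:00Z");
        Instant lockUntil = t0.plusSeconds(60); // lock held for one minute
        Instant trueNow = t0.plusSeconds(30);   // halfway through the lock period

        // Accurate clock: the lock is correctly seen as held.
        System.out.println(looksExpired(lockUntil, trueNow, Duration.ZERO));          // false
        // Clock running 45 seconds ahead: the lock wrongly looks expired.
        System.out.println(looksExpired(lockUntil, trueNow, Duration.ofSeconds(45))); // true
    }
}
```

In this model, an instance whose clock runs ahead would take over a lock that is still validly held, which is why synchronized clocks and a lockAtLeastFor margin larger than the expected skew both help.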

6. Conclusion

Integrating ShedLock with MongoDB ensures that each scheduled task runs on at most one instance of your application at a time. With simple setup and configuration, ShedLock can prevent duplicate task executions, making it a valuable tool for distributed Spring Boot applications.