When to Create an Individual Task Modelled Batch Job

Last modified: 01/19/25




When should you use an Individual Task Modeled Batch Job?

 

Creating an Individual Task Modeled Batch Job in the context of finance and operations can bring several advantages and disadvantages. First, let’s look at a visual of how Individual Task Modeled batch jobs work, just to set the stage.

Now, let’s break the pros and cons down:

Pros

First is scalability. When using SysOperation in the style of “individual task modeling”, you can scale your workloads nearly infinitely. This scalability comes with costs, which we’ll discuss later. However, if those costs are not a concern, you can stack up to 2 million tasks. These tasks will run whenever capacity is available, potentially affecting other workloads awaiting batch capacity. That introduces performance and administrative issues covered later, but the point stands: you can create 2 million tasks, and those tasks will be completed, one at a time, as fast as available batch capacity allows.

Next is task isolation. Because one task maps to one work item, whatever happens to that task can be traced back to exactly one item. Consider a batch job that confirms sales orders with one task per sales order. This doesn’t exist out of the box, but if you built it, one batch task would relate to one sales order, so if a sales order fails for any reason, it only affects that one sales order. Historically, postings could sometimes fail or halt because of an issue with a single item among all the items being evaluated. ITM (Individual Task Modeling) doesn’t have that issue: one task, one work item being acted on, one outcome (pass or fail). This ties neatly into the next pro.
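As a sketch of what scheduling that one-task-per-order shape could look like, here is some illustrative X++. The SalesConfirmITMService and SalesConfirmITMContract class names are hypothetical (nothing like this ships OOTB), and the DocumentStatus filter is a simplification:

```xpp
// Illustrative only: create one batch task per sales order under a single batch job.
// SalesConfirmITMService / SalesConfirmITMContract are hypothetical classes.
public static void scheduleConfirmationTasks()
{
    BatchHeader batchHeader = BatchHeader::construct();
    SalesTable  salesTable;

    batchHeader.parmCaption("Confirm sales orders - one task per order");

    while select salesTable
        where salesTable.DocumentStatus == DocumentStatus::None // simplified "not yet confirmed" filter
    {
        SysOperationServiceController controller = new SysOperationServiceController(
            classStr(SalesConfirmITMService),
            methodStr(SalesConfirmITMService, confirmOrder),
            SysOperationExecutionMode::Synchronous);

        SalesConfirmITMContract contract = controller.getDataContractObject() as SalesConfirmITMContract;
        contract.parmSalesId(salesTable.SalesId);

        // One Batch (task) record per sales order; a failure affects only that order.
        batchHeader.addTask(controller);
    }

    batchHeader.save();
}
```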

Next is error handling and recovery. Since each work item corresponds to a single batch task, it’s easier to identify and report why a particular item failed. Additionally, when working with one item, there may be known, common issues that can occur, and we can code for those with a try-catch and a retry. This is a common pattern you can find throughout the codebase from Microsoft. So if you have known issues that occur sporadically and want to trigger a retry when they do, that is easy to trap and code for on a per-work-item basis.
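Inside the per-item service method, that retry pattern typically follows the well-known X++ update-conflict shape (the service and contract names below are illustrative):

```xpp
// Retry-on-known-transient-issue pattern, scoped to a single work item.
// SalesConfirmITMService / SalesConfirmITMContract are hypothetical names.
public void confirmOrder(SalesConfirmITMContract _contract)
{
    #OCCRetryCount

    try
    {
        ttsbegin;
        // ... confirm the one sales order identified by _contract.parmSalesId() ...
        ttscommit;
    }
    catch (Exception::UpdateConflict)
    {
        if (appl.ttsLevel() == 0)
        {
            if (xSession::currentRetryCount() >= #RetryNum)
            {
                throw Exception::UpdateConflictNotRecovered;
            }
            else
            {
                retry; // known, transient issue: retry just this one work item
            }
        }
        else
        {
            throw Exception::UpdateConflict;
        }
    }
}
```

Because the catch block only wraps one work item, a retry (or a final failure) never touches the other tasks in the job.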

Next is task prioritization and orchestration. We can schedule the ITM tasks using very low priority batch groups so they “use up” batch capacity only after all other higher priority tasks complete. This helps ensure that if 2 million ITM tasks show up out of nowhere, critical operations are not impacted. In this paradigm, ITM batch tasks complete based on available capacity, not on predefined timing. Also, because we’re creating individual tasks, we can build constraints into the tasks themselves that govern how they execute. If we create 50 ITM batch tasks, we can add constraints such that task 1 executes immediately but task 2 can only execute once task 1 is complete, and so on for all tasks. This is an additional way to throttle ITM batch tasks so that if more “show up” than expected, we don’t let all batch capacity be consumed by a tumultuous procession of itinerant obligations.
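A minimal sketch of that chaining constraint, using the batch engine’s dependency support (MyItmService and its process method are hypothetical names):

```xpp
// Illustrative sketch: 50 tasks where each task may only start after the
// previous one has finished. MyItmService / process are hypothetical.
BatchHeader batchHeader = BatchHeader::construct();
SysOperationServiceController previousTask;

for (int i = 1; i <= 50; i++)
{
    SysOperationServiceController task = new SysOperationServiceController(
        classStr(MyItmService),
        methodStr(MyItmService, process),
        SysOperationExecutionMode::Synchronous);

    batchHeader.addTask(task);

    if (previousTask)
    {
        // Task i can only execute once task i-1 has finished successfully,
        // so at most one task in this chain consumes a batch thread at a time.
        batchHeader.addDependency(task, previousTask, BatchDependencyStatus::Finished);
    }

    previousTask = task;
}

batchHeader.save();
```

The trade-off is discussed under the cons: every dependency is something the batch engine must evaluate before dispatching a task.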

 

Cons

The first con is the overall complexity of the implementation as well as the execution of the tasks. You can get more details here, but this style of batch job requires a service class, a data contract, a controller class, and a work class. The actual amount of code isn’t much more than you may have with a conventional batch job, but it’s split up, and parameters are passed between classes to make the magic happen. Each work item has a data contract that tells the work class what to do. This is all stored in the batch task (Batch) record that is created, and there will be one batch task for each work item we want to process under the one batch job. Unlike RunBaseBatch, this has different class interactions that take some time to learn.
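A stripped-down skeleton of the contract and service pieces illustrates how the parameters travel between classes (names are illustrative; the controller and task-creation code are shown earlier in the pros):

```xpp
// Hypothetical data contract: serialized into the Batch record for each task,
// so each task knows exactly which work item it owns.
[DataContractAttribute]
public class SalesConfirmITMContract
{
    private SalesId salesId;

    [DataMemberAttribute('SalesId')]
    public SalesId parmSalesId(SalesId _salesId = salesId)
    {
        salesId = _salesId;
        return salesId;
    }
}

// Hypothetical service class: the "work class" that processes one item.
public class SalesConfirmITMService extends SysOperationServiceBase
{
    public void confirmOrder(SalesConfirmITMContract _contract)
    {
        // Process exactly one work item: the order in _contract.parmSalesId().
    }
}
```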

Next is resource contention. This is the major drawback once the number of work items to process starts to grow. The ITM model takes all available work, breaks it down into one task per item, then schedules each task to be executed by the earliest available batch thread that can do the work. There are rules around capacity and workload that can help address the contention, but we’re essentially taking a large bucket of Lego bricks, dumping them on the floor, then saying “clean this up one piece at a time”. If you have the batch capacity and time to do that extra work, or the number of pieces is low, using an ITM model is fine. Cleaning up 100 Lego bricks is easy and fast; 1K or even 10K is acceptable. However, we end up with lots of extra, small tasks, and we can oversaturate the batch capacity system with little warning if we’re not paying attention. Eventually, the capacity for a given time frame is completely consumed, and other batch jobs that you expect or need to run in that window start to get pushed back.

Next is monitoring, management, and administration. I’m going to lump all three into one generic problem: most people don’t monitor batch capacity until there is an issue. With ITM batch jobs, you can go from no problem to critical problem(s) rapidly, because an ITM job can create 100K batch tasks based on some other input or activity that may be an everyday item, just a lot of it. For instance, consider a custom ITM-style batch job that confirms sales orders. If a sudden influx of eCommerce orders appears, everything else may scale just fine to accommodate the extra orders, but the ITM batch job will create one batch task per order and then run all of those tasks using all available capacity until every task is processed. This may not be a major concern, but it can occur, and like a lot of problems, it’s not a problem until it is.

Next is development and testing. Development can be somewhat tricky, or simply new, if you’ve never created a batch job in this style. A code sample can be found here. With this style of batch job, we can introduce dependencies on other tasks in the batch process, which has to be both developed for and tested for. Consider that we have 5 tasks: A, B, C, D, and E. We can create our ITM batch tasks to only start after the previous task in the chain is complete, but then we have to develop and test for that. This also introduces additional overhead, because each task is considered for execution, and if it is not eligible to execute yet, the batch system moves on to consider another task. A minor consideration at low volumes, but not so at large or hyper volumes. As a corollary to development, we have to test for all the scenarios we’ll support at all potential volumes of the workload.

The Overall Problem

In general, I don’t recommend ITM-style batch jobs for anything other than “small” workloads, less than 5,000 tasks per iteration. The chart below is meant to show the problem: overhead.

Above we’re comparing two buckets side by side: the amount of time to schedule all of the tasks with the batch engine (create BatchTask records), in yellow, versus the amount of time it takes for the batch system to process all of the tasks created, in orange. You can find a code sample for this here. With 1,000 records, the time required to create the tasks is too small to even see on the chart next to the execution time. At low volumes, the amount of time consumed is very low. At 10,000 work items, however, we can see a clear disparity between time to schedule and time to process: it takes approximately 16 minutes to process 10,000 work items using all available batch capacity in this environment (16 available threads). We see the same issue with 100,000 work items. Execution appears to scale linearly, but processing 100,000 records in approximately 167 minutes isn’t particularly impressive, and we consumed 16 threads to get that pretty lousy result.

Conclusion

Creating an Individual Task Modeled Batch Job can bring significant benefits in terms of efficiency, flexibility, and error handling, making it a powerful approach for managing complex operations in finance. However, the complexity in design, implementation, and maintenance needs to be carefully managed to fully realize these benefits. The decision to adopt this model should consider the specific requirements and capabilities of the organization, balancing the potential advantages against the inherent challenges. I’d recommend ITM batch jobs for scenarios where you have a small number of tasks, fewer than 5,000, that you want completed as fast as possible on a per-work-unit basis, where you don’t care how much overhead you incur, ideally running everything in the middle of a window with very little other activity, like 2 AM.

Original Post https://www.atomicax.com/article/when-create-individual-task-modelled-batch-job
