
A top picking batch job is a particular type of multi-threaded, high-throughput batch job with a distinct design pattern: work records are inserted into a staging table used only by that batch job, selected back out in a defined order, and the pattern explicitly acknowledges that there may be database contention for those records and plans for and handles that specific case. Imagine you have a list of 1,000 things to do. As one person, you would pick one item, do it, then move to the next. Now imagine it’s you plus 9 other workers all grabbing tasks. First, we have to have an organized list of tasks for workers to pull from – that’s the staging table. Next, we have to have some order to the chaos – that’s where we select top n with an order by. Finally, we have to acknowledge that with the speed at which tasks can be grabbed, the same task may be grabbed by two workers, so we must have a system to determine who gets the task and who doesn’t – that’s the contention handling. Database contention can be handled very easily with a well-placed readPast(true) call on the table buffer to skip over any records that are locked. Below is a diagram that aims to graphically explain what is happening in the background for each thread inside a top picking batch job, and how we can have multiple threads running at once.
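To make the mechanics concrete, here is a minimal X++ sketch of the pick loop each thread runs. The staging table MyWorkStagingTable, its Priority field, the MyWorkStatus enum, and the processSingleWorkItem method are all hypothetical names used for illustration – this is a sketch of the pattern, not code lifted from any shipped Microsoft feature.

```
// Minimal sketch of the per-thread pick loop for a top picking batch job.
// MyWorkStagingTable, MyWorkStatus (Ready, InProcess, Done, Error), Priority
// and processSingleWorkItem are illustrative names only.
private void pickAndProcess()
{
    MyWorkStagingTable workItem;
    boolean            moreWork = true;

    while (moreWork)
    {
        ttsbegin;

        // Skip rows already locked by other worker threads instead of waiting
        // on them - this is the contention handling described above.
        workItem.readPast(true);

        // Grab the single highest-priority ready record and lock it.
        select pessimisticLock firstOnly workItem
            order by workItem.Priority
            where workItem.Status == MyWorkStatus::Ready;

        if (workItem)
        {
            workItem.Status = MyWorkStatus::InProcess;
            workItem.update();
            ttscommit;

            // Each record is the smallest unit of work; a failure here can set
            // just this record to Error without touching the rest of the batch.
            this.processSingleWorkItem(workItem);
        }
        else
        {
            ttscommit;
            moreWork = false; // Nothing left that this thread can pick.
        }
    }
}
```

The combination of the pessimisticLock hint and readPast(true) is what lets ten copies of this loop run side by side without two threads ever committing the same work unit.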
One of the main advantages of using a top picking pattern is the overall improved performance. You’ll also notice that Microsoft has started using this pattern for new features where performance and scale matter, such as Customer interest note creation – top picking optimization and Journal batch posting with ‘Top-Picking’ pattern. This does appear to be the pattern of choice when scale and performance are critical. A personal favorite is the way you typically build the interface for this style: if you want to run it single threaded, you can; if you want it multi-threaded, that’s also an option, so you have that utility if you need it. This provides another level of certainty that the job will scale if needed. One of the main reasons for the performance is the lack of context switching. In simple terms, we keep the same code in RAM and simply change the data flowing through it, so only the inputs and outputs change, which turns the work into a database problem – and that’s largely fine as long as we’re using the database the way we should. The lack of context switching is something you also get with a “classic” batch job, so we’re taking the good from another style and scaling it up. With batch priority now being a consideration in how batch jobs are scheduled, we can couple that with this style of batch job and ensure our custom processes or logic nearly always run when required, and that sudden spikes in demand, inputs, or outputs won’t overload some area of the business. In simple terms, you can plan for 10,000 of x to show up but still (more than likely) handle 100,000 without too much trouble. Lastly, because we’re processing records that represent the smallest work unit we can handle, if we have an issue with one or two work units, those can go to an error status without causing issues with any other work units. It’s still possible for one “bad” record to derail a “classic” batch job if it isn’t designed in a robust manner.
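As a sketch of that single-versus-multi-threaded flexibility, the controlling batch task can simply add one runtime worker task per configured thread. MyTopPickingWorker and the thread-count parameter are hypothetical names; the RunBaseBatch and BatchHeader calls shown below are the typical way runtime tasks are spawned, though a SysOperation-based controller would work just as well.

```
// Hedged sketch: fan out N worker tasks from the controlling batch task.
// With _threadCount == 1 the same worker class simply runs single threaded.
// MyTopPickingWorker is an illustrative class name, not a standard one.
protected void createWorkerTasks(int _threadCount)
{
    BatchHeader batchHeader = this.getCurrentBatchHeader();

    for (int i = 1; i <= _threadCount; i++)
    {
        // Every task is another instance of the worker whose pick loop is
        // sketched earlier; they all drain the same staging table.
        MyTopPickingWorker worker = MyTopPickingWorker::construct();
        batchHeader.addRuntimeTask(worker, this.getCurrentBatchTask().RecId);
    }

    batchHeader.save();
}
```

Because every worker runs the same pick loop against the same staging table, adding or removing threads is purely a throughput decision and requires no change to the processing logic itself.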
It’s not all magic, sunshine, and rainbows when using a top picking pattern. The code can be complex and sometimes difficult to debug because of the multi-threaded nature of the pattern. It’s an interesting situation where you typically have to do more work upfront in order to do less work later. Related to this is the amount of data being pumped through the shared work unit table. If you have a lot of data, the database has to sort through it and manage the concurrency. Not a large issue – that’s what databases do well – but something to keep in mind. Scale may also be an issue. I know I just said these scale well should a flurry of activity all arrive at once, but that’s actually a potential problem. Consider orders flowing in from an eComm platform, with a top picking batch job sending them to the warehouse, and a sudden spike of orders arrives. This is a good problem to have, but if overall batch capacity isn’t managed and configured well across all available batch workloads, it’s possible all batch capacity will go to processing inbound orders rather than anything else. This could result in an anemic feed of orders being released to the warehouse. This situation can be mitigated, but you have to know what to do and also how to do it, which is a specialized skill inside the warehousing module. This is just another complexity to manage.
Creating a top-picking batch job is a strategic decision that depends on various operational and business factors. In Finance and Operations, this type of job is designed to select and prioritize the most critical or valuable tasks or items from a larger set. Here’s when it makes sense to create a top picking batch job:
Creating a top-picking batch job is beneficial when there’s a need to prioritize critical tasks due to high volume, limited resources, or the strategic importance of certain processes. It’s particularly useful in environments where meeting deadlines and optimizing performance are crucial. However, careful consideration must be given to the criteria for prioritization and the potential impacts on lower-priority tasks to ensure a balanced and effective approach.
Original Post https://www.atomicax.com/article/when-create-top-picking-batch-job






