Over time, the tasks that OLTK is undertaking have become more intensive. It has moved from being a simple file naming utility to an advanced Multi-PDF editor. With that progression has come a slow but steady decline in performance as more features have been added. This has finally reached a point where the application no longer feels fast to use and as such, Release 1.4.3.0 is dedicated to improving the performance of the application.

General Approaches

So how do we improve the performance of the application?

In a .NET desktop application, a few guiding principles help to improve performance:

  • Keep intensive work off the UI thread – Blocking the UI thread with intensive tasks such as rendering large images makes the application look like it is not responding and makes it feel slow to the user. In C#, this can be avoided by wrapping long-running work in Task.Run, for example: await Task.Run(() => MyLongRunningTaskAsync());
  • Minimise the amount of intensive work being done – If we start rendering a large image and then switch the view to another image, we don’t need to keep rendering the first image and can safely cancel the render. This means the app can start work sooner on the image we now want to see.
  • Use IDisposable and Dispose judiciously – Large objects such as images from a previously in-view page can be disposed as soon as we no longer need them. This frees their memory promptly and helps to prevent memory leaks, which degrade performance over time.
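The three principles above can be combined in one place. The following is a minimal sketch, not OLTK's actual code: RenderPage, ShowPageAsync and the MemoryStream standing in for a rendered bitmap are all hypothetical names chosen for illustration. It moves the render onto a background thread, cancels a render that has been superseded, and disposes the image that is no longer on screen.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

CancellationTokenSource? activeRender = null; // render currently in flight, if any
MemoryStream? currentImage = null;            // stand-in for the bitmap on screen
int? currentPage = null;

// Hypothetical CPU-heavy render: checks the token between chunks of work.
static MemoryStream RenderPage(int pageIndex, CancellationToken token)
{
    for (var step = 0; step < 10; step++)
    {
        token.ThrowIfCancellationRequested();
        Thread.Sleep(20); // simulated work
    }
    return new MemoryStream(new byte[1024]);
}

// Assumed to be called from a single thread (the UI thread in a real app).
async Task ShowPageAsync(int pageIndex)
{
    // Cancel and release the render for the page we are navigating away from.
    activeRender?.Cancel();
    activeRender?.Dispose();
    activeRender = new CancellationTokenSource();
    var token = activeRender.Token;

    try
    {
        // Keep the heavy work off the calling thread.
        var image = await Task.Run(() => RenderPage(pageIndex, token), token);
        if (token.IsCancellationRequested)
        {
            image.Dispose(); // finished too late, nobody wants this image
            return;
        }
        currentImage?.Dispose(); // free the previously displayed image
        currentImage = image;
        currentPage = pageIndex;
    }
    catch (OperationCanceledException)
    {
        // Superseded by a newer request: nothing to display.
    }
}

var first = ShowPageAsync(0);  // starts rendering page 0
var second = ShowPageAsync(1); // the user switches page straight away
await Task.WhenAll(first, second);
activeRender?.Dispose();
Console.WriteLine($"Showing page {currentPage}"); // prints "Showing page 1"
```

The render for page 0 is abandoned as soon as page 1 is requested, so only one image is ever produced and held in memory.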

ReactiveUI Specific Improvements

OLTK uses ReactiveUI as its MVVM framework. There are some useful things to bear in mind to avoid performance issues with ReactiveUI:

  • Use Throttle – When using any of the WhenAny overloads such as WhenAnyValue and especially when watching multiple properties with them, add .Throttle(TimeSpan.FromMilliseconds(50)) into the chain of methods. This useful little function waits a short amount of time to see if there is another update coming in quick succession and then only processes the final update. This is a great way to minimise state updates and rerenders.
  • Use DistinctUntilChanged – This is another great way to minimise the number of updates coming out of an observable chain – especially useful when watching multiple properties to calculate a boolean indicating whether to enable a command. Adding .DistinctUntilChanged() towards the end of your pipeline will only allow events to be emitted if the value is different from the previous event.
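As a rough illustration of both operators, here is a minimal sketch that assumes the System.Reactive NuGet package is referenced. The two BehaviorSubject instances are hypothetical stand-ins for view model properties; in a ReactiveUI view model the combined stream would more likely come from WhenAnyValue.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;
using System.Threading.Tasks;

// Hypothetical stand-ins for two observed view model properties.
var fileName = new BehaviorSubject<string>("");
var pageCount = new BehaviorSubject<int>(0);
var emitted = new List<bool>();

using var subscription = fileName
    .CombineLatest(pageCount, (name, pages) => name.Length > 0 && pages > 0)
    .Throttle(TimeSpan.FromMilliseconds(50)) // wait for a 50 ms pause, keep only the last value
    .DistinctUntilChanged()                  // drop values equal to the previous one
    .Subscribe(canSave =>
    {
        lock (emitted) emitted.Add(canSave);
        Console.WriteLine($"CanSave: {canSave}");
    });

// A burst of changes in quick succession is collapsed into one update.
fileName.OnNext("a");
fileName.OnNext("ab");
pageCount.OnNext(3);
await Task.Delay(200);

// The value is still true, so DistinctUntilChanged suppresses this update.
pageCount.OnNext(4);
await Task.Delay(200);
```

One thing to keep in mind is that Throttle delivers its values from a timer on a background scheduler by default, so in a ReactiveUI pipeline it is usually followed by .ObserveOn(RxApp.MainThreadScheduler) before anything touches the UI.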

Multiple PDFIUM Processes

The biggest single performance improvement added in OLTK 1.4.3.0 is a change in architecture with regards to how we render PDF pages as images for display using PDFIUM. For background, OLTK uses the open-source PDFIUM library for all PDF operations. PDFIUM is a full-featured, high-performance PDF library maintained by Google as part of the Chrome project. The key thing to note is that PDFIUM is a single-threaded library and does not natively support multi-threaded programming. This means that when using it from a multithreaded application such as OLTK, we need to place locks around any calls to its methods so only one thread can access it at a time. This makes using it thread-safe.

    public static void FPDF_InitLibrary()
    {
        lock (LockString)
        {
            Imports.FPDF_InitLibrary();
        }
    }

The problem with this approach is that if we want to execute multiple long-running tasks using PDFIUM, the threads queue up at the lock, each waiting for the previous one to release it. This is especially problematic if one of the threads calling PDFIUM is the UI thread, as it is then blocked until the background threads have completed their PDFIUM tasks. For OLTK, this was a major contributor to the sluggish performance of the application.

So what options do we have? One option would be to replace PDFIUM with another PDF library, but this would be a lot of work and take a lot of time. Other open-source PDF libraries also have the same issue of not being fully thread-safe.

Having ruled out replacing PDFIUM, we need a way to fix this performance issue. Because PDFIUM is not thread-safe, we can’t simply remove the lock, yet we still need a way to do more than one thing at once. I chanced upon the following post in the PDFIUM Google user group:

This was the inspiration for the solution – we can run multiple PDFIUM processes and thus render more than one image at the same time. So how do we accomplish this?

PDF Worker Pool

The first step is to create a PDF Worker project. This is just a simple .NET console app project that will be packaged with the main application. In this, we can make our calls to PDFIUM. We can create multiple instances of this worker using Process.Start and we can configure it not to show a console window so the user doesn’t see it.
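A minimal sketch of launching such a worker without a visible console window might look like the following. The executable name PdfWorker.exe and the idea of passing a pipe name as the command-line argument are assumptions made for illustration and are not taken from OLTK itself.

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Builds the start-up settings for one hidden worker process.
static ProcessStartInfo BuildWorkerStartInfo(string workerPath, string pipeName) => new ProcessStartInfo
{
    FileName = workerPath,
    Arguments = pipeName,    // hypothetical: tells the worker which pipe to listen on
    UseShellExecute = false, // start the executable directly, not through the shell
    CreateNoWindow = true,   // do not show a console window to the user
};

// Hypothetical location: the worker is packaged next to the main application.
var workerPath = Path.Combine(AppContext.BaseDirectory, "PdfWorker.exe");
var startInfo = BuildWorkerStartInfo(workerPath, "oltk-pdf-worker-0");

if (File.Exists(startInfo.FileName))
{
    using var worker = Process.Start(startInfo);
    Console.WriteLine($"Started worker with process id {worker?.Id}");
}
else
{
    Console.WriteLine("Worker executable not found; nothing started.");
}
```

CreateNoWindow only takes effect when UseShellExecute is false, which is why the two settings appear together.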

OLTK could theoretically trigger up to several hundred PDF page renders at once in some scenarios. We therefore do not want to create a new render worker for each render request, as this could very quickly overwhelm a system’s resources. Instead, what we really need is a pool of render workers that we create on application startup. We can then assign each render task to a worker from this pool and, once the task has finished, release the worker back to the pool for reuse.

To do this, in our PdfWorkerPoolService, we utilise a BlockingCollection. This is our pool of workers.

    private readonly BlockingCollection<PdfWorkerClient> _availableWorkers;

BlockingCollections are thread-safe. The blocking collection has a Take method which has some useful properties. When we call Take on the _availableWorkers collection, one of two things will happen:

  • If there is an available worker, the worker is removed from the _availableWorkers collection and we can then use it to process our task.
  • If there is no available worker in the collection, then the calling thread waits until there is a worker available. We can then use the worker to process our task.

Once the work is complete, we simply add our worker back into the _availableWorkers collection and it is ready for reuse. This means that if all of our PDF workers are busy when a task arrives, work will queue up for them to do and will be undertaken by the next available worker. Neat!

public async Task<PDFResponse> QueueRequestAsync(PDFRequest request)
{
    // Take() blocks the calling thread until a worker is available in the collection,
    // so call this from a background thread rather than the UI thread.
    var worker = _availableWorkers.Take();

    try
    {
        // Use the worker to process the render request.
        return await worker.SendCommandAsync(request);
    }
    finally
    {
        // Re-add the worker to the available pool for reuse.
        _availableWorkers.Add(worker);           
    }
}
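To show the whole life cycle, the sketch below fills a pool at startup and then pushes more requests at it than there are workers. It is a simplified stand-in rather than the real PdfWorkerPoolService: workers are represented by plain integer ids, and RenderOnWorkerAsync and QueueRenderAsync are hypothetical names. Because Take() blocks, the sketch wraps it in Task.Run so that the wait happens on a thread-pool thread instead of the caller's thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

const int WorkerCount = 2;

// Create the pool once, at startup; each int is a stand-in for a worker client.
var availableWorkers = new BlockingCollection<int>();
for (var id = 0; id < WorkerCount; id++)
{
    availableWorkers.Add(id);
}

// Hypothetical render call: pretend the worker takes 50 ms per page.
static async Task<string> RenderOnWorkerAsync(int workerId, int page)
{
    await Task.Delay(50);
    return $"page {page} rendered by worker {workerId}";
}

Task<string> QueueRenderAsync(int page) => Task.Run(async () =>
{
    var workerId = availableWorkers.Take(); // waits here while every worker is busy
    try
    {
        return await RenderOnWorkerAsync(workerId, page);
    }
    finally
    {
        availableWorkers.Add(workerId);     // always hand the worker back
    }
});

// Four requests compete for two workers; the extra requests wait their turn.
var results = await Task.WhenAll(Enumerable.Range(0, 4).Select(page => QueueRenderAsync(page)));
foreach (var line in results)
{
    Console.WriteLine(line);
}
Console.WriteLine($"Workers back in pool: {availableWorkers.Count}"); // prints "Workers back in pool: 2"
```

The trade-off of this design is that each waiting request occupies a thread while it waits. If that ever became a bottleneck, an asynchronous alternative such as SemaphoreSlim.WaitAsync or System.Threading.Channels could be considered, but that is a different technique from the one described here.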

Communication With The PDF Worker

The final part of the puzzle is how to communicate between the main application and the PDF Worker. There are many methods for achieving inter-process communication. In this case, as the worker is running on the same machine as the main application, we have elected to use named pipes.

Each render worker will function as a NamedPipeServer and wait for incoming connections from the main application which will be our NamedPipeClient. Once a connection is established, the main application transmits a request as a JSON serialized request object containing a command and some parameters. The worker receives this and deserializes this using the command as a discriminator to find the correct type of request object.

namespace oltk.Shared.Requests
{
    public class RenderPageRequest : PDFRequest
    {
        public override string Command => "RenderPage";
        public required string FilePath { get; set; }
        public int PageIndex { get; set; }
        public double Scale { get; set; }
    }
}

using System;
using oltk.Shared.Requests;
using System.Text.Json;

namespace oltk.Shared.Serializer
{
    public static class PDFRequestSerializer
    {
        public static string Serialize(PDFRequest request)
        {
            return JsonSerializer.Serialize(request, request.GetType());
        }

        public static PDFRequest Deserialize(string json)
        {
            using (JsonDocument doc = JsonDocument.Parse(json))
            {
                var root = doc.RootElement;
                if (!root.TryGetProperty("Command", out var commandElement))
                {
                    throw new InvalidOperationException("Missing 'Command' property in request.");
                }

                var command = commandElement.GetString();

                if (command == null)
                {
                    throw new InvalidOperationException("The 'Command' property in the request must not be null.");
                }

                PDFRequest? request = command switch
                {
                    "RenderPage" => JsonSerializer.Deserialize<RenderPageRequest>(json),
                    _ => throw new NotSupportedException($"Unknown command: {command}")
                };

                if (request == null)
                {
                    throw new InvalidOperationException("Could not deserialize the request object");
                }

                return request;
            }
        }
    }
}

The request can then be processed as needed and the result can be sent back to the main application in a similarly serialized response object.
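For a feel of the full round trip, here is a minimal sketch using NamedPipeServerStream and NamedPipeClientStream from System.IO.Pipes, with both ends running in one process for demonstration. The pipe name, the one-JSON-document-per-line framing and the shape of the response object are assumptions made for illustration; the article does not specify how OLTK frames its messages.

```csharp
using System;
using System.IO;
using System.IO.Pipes;
using System.Text.Json;
using System.Threading.Tasks;

const string PipeName = "oltk-pdf-worker-demo"; // hypothetical pipe name

// Worker side: accept one connection and answer each request line until the client disconnects.
static async Task RunWorkerAsync(string pipeName)
{
    using var server = new NamedPipeServerStream(
        pipeName, PipeDirection.InOut, 1, PipeTransmissionMode.Byte, PipeOptions.Asynchronous);
    await server.WaitForConnectionAsync();

    using var reader = new StreamReader(server, leaveOpen: true);
    using var writer = new StreamWriter(server, leaveOpen: true) { AutoFlush = true };

    string? line;
    while ((line = await reader.ReadLineAsync()) != null)
    {
        using var doc = JsonDocument.Parse(line);
        var command = doc.RootElement.GetProperty("Command").GetString();
        // A real worker would dispatch on the command and call PDFIUM here.
        var response = JsonSerializer.Serialize(new { Command = command, Success = command == "RenderPage" });
        await writer.WriteLineAsync(response);
    }
}

// Main application side: connect, send one request, read one response.
static async Task<string?> SendRequestAsync(string pipeName, string requestJson)
{
    using var client = new NamedPipeClientStream(".", pipeName, PipeDirection.InOut, PipeOptions.Asynchronous);
    await client.ConnectAsync();

    using var reader = new StreamReader(client, leaveOpen: true);
    using var writer = new StreamWriter(client, leaveOpen: true) { AutoFlush = true };

    await writer.WriteLineAsync(requestJson);
    return await reader.ReadLineAsync();
}

var workerTask = RunWorkerAsync(PipeName);
var requestJson = JsonSerializer.Serialize(
    new { Command = "RenderPage", FilePath = "example.pdf", PageIndex = 0, Scale = 1.0 });

var responseJson = await SendRequestAsync(PipeName, requestJson);
await workerTask; // the worker loop ends once the client has disconnected
Console.WriteLine(responseJson);
```

In the real application the two halves live in separate processes, and the anonymous objects used here would be replaced by the shared request and response classes.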

