runlocally engineering notes

How Merge PDF is built

By Geppetto

These are the engineering notes for Merge PDF: the technologies it is built on, what each one is, and how it is used in the tool.

Tech used

pdf-lib, across multiple documents

Merging uses the same pdf-lib object-copying model introduced in the Split PDF notes: PDFDocument.create, copyPages, addPage, save. The difference is the loop: a fresh output document is created, then each input is loaded in turn and all of its pages are copied (out.copyPages(src, src.getPageIndices())) and appended. Concatenation happens at the page-object level, not by gluing two files together.
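The loop described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual source: the function name mergePdfBytes and the inputs parameter are hypothetical, and the inputs are assumed to be already read into byte arrays. Only the pdf-lib calls named in these notes, plus PDFDocument.load, are used.

```typescript
import { PDFDocument } from 'pdf-lib';

// Hypothetical helper: merge PDF byte arrays, in list order, into one document.
async function mergePdfBytes(inputs: Uint8Array[]): Promise<Uint8Array> {
  // Fresh, empty output document.
  const out = await PDFDocument.create();

  for (const bytes of inputs) {
    // Load one source; by default pdf-lib rejects encrypted input here.
    const src = await PDFDocument.load(bytes);

    // One copyPages call per input: every page of this source is copied
    // into the output document's object graph.
    const pages = await out.copyPages(src, src.getPageIndices());

    // Append the copied pages in their original order.
    for (const page of pages) {
      out.addPage(page);
    }
  }

  // Serialize the merged document to bytes.
  return out.save();
}
```

Because copyPages is called once per input, anything shared between two inputs is copied once per input, which is the behavior described under the operational notes.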

Page order

The merge order is simply the order of the input list. The UI keeps a File[] array; the ↑ / ↓ controls swap adjacent entries, and that array is handed to the merge in exactly that order. A guard requires at least two files before merging.
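As a rough sketch of the ordering logic, assuming the list is treated immutably (the notes do not say whether the UI mutates the array in place): moveEntry and canMerge are hypothetical names, and the generic T stands in for File.

```typescript
// Hypothetical helper: move the entry at `index` one slot up (-1) or down (+1).
// Moving by one slot is the same as swapping with the adjacent entry.
// Returns a new array; out-of-range moves return an unchanged copy.
function moveEntry<T>(items: readonly T[], index: number, direction: -1 | 1): T[] {
  const next = [...items];
  const target = index + direction;
  if (index < 0 || index >= next.length || target < 0 || target >= next.length) {
    return next;
  }
  const moved = next.splice(index, 1); // take the entry out
  next.splice(target, 0, ...moved);    // put it back one slot over
  return next;
}

// Hypothetical guard: merging needs at least two inputs.
function canMerge(items: readonly unknown[]): boolean {
  return items.length >= 2;
}
```

The resulting array is what would be handed to the merge, so whatever order it holds is the page order of the output.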

Shell

Same static Astro + Preact island and Service-Worker PWA shell as the other tools (see the HEIC notes); pdf-lib runs on the main thread.

Implementation & operational notes

Resources aren’t deduplicated. Each input’s pages are copied with a separate copyPages call, and pdf-lib copies each page’s referenced resources independently. If two inputs embed the same font or image, it is copied twice — so a merged file can be larger than the sum of its parts.

Default metadata. The output is a fresh document, so it carries pdf-lib’s default document info rather than any source’s title or author.

Encrypted PDFs are reported, not cracked. Same load-error handling as the splitter: an encrypted input is flagged as password-protected, not decrypted.
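One way to express "reported, not cracked" is to wrap the load step and classify the failure. The sketch below is an assumption about the shape of that handling, not the tool's code: tryLoad, LoadResult, and the "unreadable" fallback are hypothetical, and detecting encryption by matching the error message is a simplification chosen so the helper does not depend on any particular error class.

```typescript
// Hypothetical result type: either a loaded document or a reason it was skipped.
type LoadResult<T> =
  | { ok: true; doc: T }
  | { ok: false; reason: 'password-protected' | 'unreadable' };

// Hypothetical helper: run a loader and turn a failure into a flag.
async function tryLoad<T>(load: () => Promise<T>): Promise<LoadResult<T>> {
  try {
    return { ok: true, doc: await load() };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    // Assumption: the loader's error message mentions encryption.
    const reason = /encrypt/i.test(message) ? 'password-protected' : 'unreadable';
    return { ok: false, reason };
  }
}
```

With pdf-lib this would be called as tryLoad(() => PDFDocument.load(bytes)). pdf-lib rejects encrypted documents on load unless the ignoreEncryption option is set, so leaving that option off is what makes the encrypted case surface as an error to report.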

In memory, on the main thread. Every input’s bytes are held in memory and the merge runs on the main thread, so the number and size of PDFs that can be merged is bounded by available RAM, and the UI is briefly blocked while the merge runs.

Try it / source