Backup Buddy: A Simple Tool to Back Up Your Website Content

Have you ever wanted to create a backup of your website or blog? Maybe you're migrating to a new platform, or perhaps you just want a portable archive of your content that you can read offline. This site has gone down on me before, and I wanted to make sure I'd never be stuck without a backup of all my posts.

That's exactly why I built Backup Buddy - a command-line tool that turns any website with a sitemap into a collection of clean, readable Markdown files with all the images (and videos) preserved.

What Does Backup Buddy Do?

Backup Buddy takes a website's sitemap and automatically downloads every page, converting the HTML into clean Markdown format while saving all the images (and videos) locally. Think of it as creating a time capsule of your website that you can read, search, and store anywhere - no internet required.
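
To give a feel for the HTML-to-Markdown step, here is a minimal sketch in Python using only the standard library. It is an illustration, not Backup Buddy's actual code: the class name MiniMarkdown and the function html_to_markdown are made up for this example, and it only handles headings, paragraphs, bold, italics, links, and images.

```python
from html.parser import HTMLParser


class MiniMarkdown(HTMLParser):
    """Hypothetical, deliberately tiny HTML-to-Markdown converter."""

    def __init__(self):
        super().__init__()
        self.out = []      # collected Markdown fragments
        self.href = None   # target of the link currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("h1", "h2", "h3"):
            # <h2> becomes "## ", and so on.
            self.out.append("#" * int(tag[1]) + " ")
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag in ("em", "i"):
            self.out.append("*")
        elif tag == "a":
            self.href = attrs.get("href") or ""
            self.out.append("[")
        elif tag == "img":
            alt = attrs.get("alt") or ""
            src = attrs.get("src") or ""
            self.out.append(f"![{alt}]({src})")

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3", "p"):
            # Block elements end with a blank line.
            self.out.append("\n\n")
        elif tag in ("strong", "b"):
            self.out.append("**")
        elif tag in ("em", "i"):
            self.out.append("*")
        elif tag == "a":
            self.out.append(f"]({self.href})")
            self.href = None

    def handle_data(self, data):
        self.out.append(data)


def html_to_markdown(html: str) -> str:
    parser = MiniMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()


if __name__ == "__main__":
    page = '<h1>Title</h1><p>Hello <strong>world</strong>.</p>'
    print(html_to_markdown(page))
```

A real converter has to deal with far more (lists, tables, code blocks, nested markup), so treat this as a picture of the idea rather than something to rely on.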

The best part? You end up with content in Markdown format, which means you can:

  • Read it in any text editor
  • Import it into a static site generator like Hugo or Jekyll
  • Search through it with standard text tools
  • Version control it with Git
  • Share it without worrying about broken links or missing images

How to Use It

Using Backup Buddy is straightforward. You just need to know your website's sitemap URL (usually something like https://yoursite.com/sitemap.xml) and run a single command:

backup-buddy https://yoursite.com/sitemap.xml

That's it! The tool will:

  1. Download your sitemap
  2. Find all the URLs
  3. Download each page
  4. Convert the HTML to Markdown
  5. Save all the images
  6. Organize everything into neat folders
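
If you're curious what steps 1 and 2 involve, here is a small Python sketch of pulling the page URLs out of a sitemap. It assumes a standard sitemaps.org urlset document; the function name extract_urls and the sample XML are made up for illustration and are not taken from Backup Buddy's source.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Made-up sitemap used only for this example.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/first-post</loc></url>
  <url><loc>https://example.com/second-post</loc></url>
</urlset>"""


def extract_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL in a sitemap, in document order."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        if loc.text and loc.text.strip():
            urls.append(loc.text.strip())
    return urls


if __name__ == "__main__":
    for url in extract_urls(SAMPLE):
        print(url)
```

This sketch does not follow sitemap index files (a sitemap that points at other sitemaps), which some larger sites use.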

Getting Started

Head over to the GitHub repo (github.com/deanhume/backup-buddy) and download the latest release.

What You Get

After running Backup Buddy, you'll find an output folder with all your content organized like this:

output/
├── 1_my-first-post/
│   ├── my-first-post.md
│   ├── metadata.txt
│   ├── images/
│   │   ├── header-image.jpg
│   │   └── diagram.png
│   └── videos/
│       └── demo.mp4
├── 2_another-post/
│   ├── another-post.md
│   ├── metadata.txt
│   └── images/
│       └── photo.jpg
└── ...

Each page gets its own folder containing:

  • The Markdown version of your content
  • A metadata file with the original URL and backup date
  • An images folder with all pictures from that page
  • A videos folder with any videos from that page
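
As a rough illustration of the naming scheme above, this Python sketch derives a folder name like 1_my-first-post from a page's position and URL, and builds the text for a metadata file. The helper names, the "index" fallback for a bare homepage URL, and the metadata layout are all assumptions made for this example; the real tool may do each of these differently.

```python
from datetime import date
from urllib.parse import urlparse


def slug_from_url(url: str) -> str:
    """Use the last path segment as the slug (assumed convention)."""
    path = urlparse(url).path.strip("/")
    last = path.split("/")[-1] if path else ""
    # Assumption: fall back to "index" when the URL has no path.
    return last or "index"


def page_folder(index: int, url: str) -> str:
    """Build a folder name such as '1_my-first-post'."""
    return f"{index}_{slug_from_url(url)}"


def metadata_text(url: str, backed_up_on: date) -> str:
    """Hypothetical metadata.txt content: original URL and backup date."""
    return f"url: {url}\nbackup_date: {backed_up_on.isoformat()}\n"


if __name__ == "__main__":
    url = "https://example.com/my-first-post"
    print(page_folder(1, url))
    print(metadata_text(url, date.today()), end="")
```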

Why I Built This

I've been blogging for years, and I've migrated platforms more than once. Each time, I worried about losing content, breaking image links, or dealing with complicated export tools. I wanted something simple that would give me a clean backup I could actually use.

The Markdown format is perfect because it's:

  • Human-readable - You can open it in any text editor
  • Future-proof - It's just text, so it'll work forever
  • Portable - Easy to import into almost any blogging platform or static site generator
  • Git-friendly - You can track changes and collaborate on content

Performance Features

One thing I made sure to include was speed. Backup Buddy processes multiple pages at once (10 by default), so even large websites with hundreds of pages get backed up quickly. You'll see real-time progress as it works:

[1/150] Processing: https://example.com/first-post
[2/150] Processing: https://example.com/second-post
  ✓ Saved to: output/1_first-post
  ✓ Saved to: output/2_second-post
[3/150] Processing: https://example.com/third-post
...
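
For anyone wondering how processing several pages at once can work, here is a minimal Python sketch using a thread pool with 10 workers, matching the default mentioned above. It is not Backup Buddy's implementation: backup_page is a stand-in that only computes an output folder name, and backup_all is a hypothetical name for this example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def backup_page(index: int, url: str) -> str:
    """Stand-in for the real work (download, convert, save media).

    Here it only computes the folder the page would be saved to.
    """
    slug = url.rstrip("/").split("/")[-1]
    return f"output/{index}_{slug}"


def backup_all(urls: list[str], workers: int = 10) -> dict[str, str]:
    """Process pages concurrently and return a URL -> folder mapping."""
    total = len(urls)
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {}
        for i, url in enumerate(urls, start=1):
            # Printed when the job is queued, not when a worker picks it up.
            print(f"[{i}/{total}] Processing: {url}")
            futures[pool.submit(backup_page, i, url)] = url
        # Results arrive in completion order, which may differ from queue order.
        for future in as_completed(futures):
            folder = future.result()
            print(f"  Saved to: {folder}")
            results[futures[future]] = folder
    return results


if __name__ == "__main__":
    backup_all([
        "https://example.com/first-post",
        "https://example.com/second-post",
    ])
```

Threads suit this kind of job because most of the time is spent waiting on the network, so a handful of workers can keep many downloads in flight at once.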

A Few Things to Know

While Backup Buddy handles most websites well, there are a few limitations:

  • It only backs up URLs in your sitemap (which is usually everything important)
  • Content that's loaded with JavaScript after the page loads won't be captured
  • Images need to be accessible when you run the backup
  • On very large sites, you might need to adjust the parallel processing settings

Try It Out

If you've been looking for a simple way to back up your website or blog, give Backup Buddy a try. It's open source (MIT License), so you can use it freely and even modify it for your needs.

Check it out on GitHub: github.com/deanhume/backup-buddy

Have questions or suggestions? Feel free to open an issue or submit a pull request. Happy archiving!