Running Calibre server on AWS EC2: your ebooks available wherever you are

I have built a large library of PDF books over the years, using the Calibre ebook management tool to manage the collection. For a long time I ran it only on my local PC, but I sometimes found myself wanting to refer to something when I wasn’t sat at my home office desk. So I wanted an online mirror that would let me access my library wherever I was (while securing it from unauthorised access). This article provides an overview of how I did it, for readers who are interested in doing the same.

What is Calibre?

Calibre is a mature open source ebook management tool. It has a broad range of features, and it runs on Windows, Linux, and macOS.

How did I use Calibre - and what did I want to do in future?

I mainly used Calibre to manage my library of PDFs - setting catalogue data for the books (series, author, publisher, and whatever other tags I want) and finding the books I want. At home I dual-boot between Windows and Ubuntu Linux and have Calibre installed on both environments, storing the library on an NTFS partition visible to both. Calibre does also have the capability to run a web server to make its library available via a web browser.

I wanted a way to access the library when I was away from home, from whatever device I was using. I needed it to be secure, so that only I could access it - while I wasn’t storing anything confidential, some of the books are licensed only for me, so I needed to protect it from unauthorised access. I didn’t want to rely on my home PC being switched on, or on my home Internet bandwidth. And I didn’t need to be able to add books or otherwise manage the library while I was away, so a read-only snapshot was fine.

I already had an Ubuntu Linux server running on Amazon Web Services EC2 platform, which I could use for the solution.

The approach

The approach I followed was as follows:

  1. Install Calibre on the Ubuntu Linux server
  2. Create a separate user account on the Linux server for the Calibre library and service (to reduce the risk if the Calibre service was attacked)
  3. Copy the library folder from the local PC to the Linux server
  4. Start Calibre’s web server service on the Linux server, pointing at the library stored there
  5. Set the Linux server to automatically re-start the Calibre web server upon reboot
  6. Periodically send library updates from the local PC to the Linux server (using rsync to avoid copying the whole library every time)
  7. Restrict access to the Calibre server by setting up Apache as a reverse proxy (so all requests to Calibre are sent via Apache) and enforcing authentication and authorisation in the Apache configuration

Although I used an Ubuntu Linux server on EC2, this approach doesn’t use any other AWS features and could be used for any Linux hosting environment.

This approach is heavily based on Gareth Dwyer’s How To Create a calibre Ebook Server on Ubuntu 20.04, but adapted to reflect what I wanted to do. Gareth Dwyer’s tutorial is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License - as is this article.

Prerequisites

This guide assumes that you have an Ubuntu Linux server (I have tried with releases 20.04 and 22.04) available to access via secure shell (ssh), and that you have a basic understanding of Linux system administration. It is likely to work with minor adaptations on other Linux distributions.

The guide assumes that

  • the standard account on the Linux server is called ubuntu, which has sudo privileges
  • the Calibre service will run as the user calibre-data (by analogy to the www-data user for Apache)
  • the library is to be installed in /home/calibre-data/calibre-library

Disclaimer

This guide is for general information only. It is not guaranteed to be comprehensive or to cover all relevant details, and it is likely to need adaptation to meet your individual circumstances. You should ensure that you have sufficient understanding of any changes you make based on this guide, and that you understand the associated risks (including risks associated with making a service available on the Internet) and consider these acceptable.

1. Install Calibre on the Linux server

The Calibre web site has instructions on how to download and install Calibre for Linux.

Calibre needs some packages installed before the Calibre installer itself can be run. The Calibre web site provides some guidance on this. (My experimental approach when setting this up initially was to attempt the installation and, when it reported errors with missing dependencies, find the relevant package.)

Starting from a fresh AWS EC2 Ubuntu Linux 22.04 instance, I found that the following packages needed to be installed first.

sudo apt-get install libegl1 libopengl0 libfontconfig1 libxkbcommon0 libgl1-mesa-glx

You can now run the Calibre installer, and there are some options for this.

The first option is the recommendation on the Calibre web site, using the following command line:

sudo -v && wget -nv -O- https://download.calibre-ebook.com/linux-installer.sh | sudo sh /dev/stdin

On the standard EC2 Ubuntu Linux configuration, no password is known or set for the ubuntu user, and that user can use sudo without further authentication. As a result, sudo -v (which validates the sudo credentials) is not necessary, or even possible. On a system set up in this way, the command can be started from the wget onwards.

The installation approach described above bravely pipes an executable script from a wget download directly into a shell running with root privileges. I am more comfortable with the more cautious second option from Gareth Dwyer’s tutorial:

wget https://download.calibre-ebook.com/linux-installer.sh
less linux-installer.sh
sudo sh linux-installer.sh

using less (or your other favourite tool for reviewing text files) to inspect the downloaded file before running it as root.

A third option is to use the version of Calibre in the Ubuntu repositories. However, the official download site states “Please do not use your distribution provided calibre package, as those are often buggy/outdated. Instead use the Binary install described below.” I haven’t tried the Ubuntu packages so can’t comment on the extent of issues with them.

2. Create a separate user account for the Calibre library and service

For security, I want to run the Calibre service under an account which is not the administrative ubuntu account - e.g. a non-privileged account called calibre-data. Why? Because if there is a vulnerability in the Calibre software which a remote attacker can exploit to execute commands - then I want those to be executed as a non-privileged account which only has access to Calibre files, rather than as the ubuntu user which has access to important non-Calibre files on the server and - most critically - has sudo privileges. This is similar to the principle of Apache using an unprivileged www-data account on Ubuntu.

It is possible - and a bit simpler - to run Calibre as the ubuntu user and not set up a separate user account. If you wish to do this, then you can ignore this step and adapt the rest of the instructions to refer to the ubuntu account and its home directory, instead of calibre-data.

To create the account (with password authentication disabled so it accepts only ssh authentication), you can use the following command:

sudo adduser --disabled-password calibre-data

You then need to set up ssh authentication for calibre-data. This can be done by appending the relevant public ssh key (e.g. calibre_id_rsa.pub) into /home/calibre-data/.ssh/authorized_keys, setting permissions/ownership appropriately, and storing the corresponding private key on the local PC (in the examples below, we assume this will be in ~/.ssh/calibre_id_rsa).
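
As a rough sketch of that setup, assuming the public key file has already been copied to the server (for example into the ubuntu account’s home directory), a small shell function along these lines could be run from the ubuntu account. The function name install_pubkey and the file locations are my own illustration rather than part of any tool:

```shell
# Hypothetical helper: install a public key for a service account.
# Usage: install_pubkey USER PUBKEY_FILE
install_pubkey() {
    local user="$1" pubkey="$2"
    local ssh_dir="/home/$user/.ssh"

    # Create ~/.ssh owned by the account and accessible only to it
    sudo install -d -m 700 -o "$user" -g "$user" "$ssh_dir" || return 1
    # Append the key rather than overwriting any keys already present
    sudo tee -a "$ssh_dir/authorized_keys" < "$pubkey" > /dev/null || return 1
    # With its default settings, sshd is strict about who can write here
    sudo chown "$user:$user" "$ssh_dir/authorized_keys" || return 1
    sudo chmod 600 "$ssh_dir/authorized_keys"
}
```

It would then be called as install_pubkey calibre-data calibre_id_rsa.pub. If you do not yet have a key pair, ssh-keygen -t rsa -b 4096 -f ~/.ssh/calibre_id_rsa on the local PC is one way to create it.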

3. Copy the library folder from the local PC to the Linux server

If the local PC is running Linux, then it is likely to have rsync and ssh installed already, or to have them readily available in your distribution’s package repository.

If the local PC is running Windows, ssh nowadays comes included, but rsync does not. The most common solution I found online was to install the whole Windows Subsystem for Linux (WSL) and run rsync from there. At first this solution brings to mind sledgehammers and nuts, but I’ve since found that WSL is quite a convenient environment.

(Since installing WSL, I’ve been told that there is a native Windows version/equivalent of rsync, which I shall investigate at some point. However, I’ve since grown to like WSL and find it useful for other purposes anyway.)

However you get rsync running, you can then copy the Calibre library to the remote server with a command something like this:

rsync -avz -e "ssh -i ~/.ssh/calibre_id_rsa" LOCAL_CALIBRE_LIBRARY_DIRECTORY/ calibre-data@SERVER_ADDRESS:/home/calibre-data/calibre-library
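
Before the first real transfer, it can be reassuring to preview what rsync would do. The sketch below wraps the command above in a function (sync_library is a name I have made up for illustration) so that extra rsync options, such as -n for a dry run, can be passed through:

```shell
# Hypothetical wrapper around the rsync command above.
# Usage: sync_library LOCAL_DIR SERVER_ADDRESS [extra rsync options...]
sync_library() {
    local src="$1" server="$2"
    shift 2
    # The trailing slash on the source copies the directory's contents,
    # rather than creating a nested directory on the server
    rsync -avz "$@" -e "ssh -i $HOME/.ssh/calibre_id_rsa" \
        "${src%/}/" "calibre-data@$server:/home/calibre-data/calibre-library"
}
```

Running sync_library LOCAL_CALIBRE_LIBRARY_DIRECTORY SERVER_ADDRESS -n lists the files that would be copied without transferring anything; dropping the -n performs the copy. Note that, as written, rsync does not remove files from the server when they are deleted locally. rsync’s --delete option does that, and a dry run is a sensible way to check its effect first.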

4. Start the Calibre web server service on the Linux server

You can now start the Calibre service on the Linux server with a command like calibre-server /home/calibre-data/calibre-library. This command should be run as the calibre-data user so that the service runs with the limited privileges of that user.

This runs an HTTP server accessible on port 8080, which you can browse from a web browser at http://SERVER_ADDRESS:8080. (If you cannot access it from your local PC, then check that firewall rules allow external incoming connections to port 8080.)
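
For a first manual test from the ubuntu account, something like the following sketch may be useful. It assumes the account and library path used in this guide; the two function names are hypothetical:

```shell
# Hypothetical helpers for a first manual test of the server.

# Run calibre-server in the foreground as the unprivileged account.
# -u selects the account; -H sets HOME to that account's home directory.
run_calibre_as_service_user() {
    sudo -H -u calibre-data calibre-server /home/calibre-data/calibre-library
}

# From a second terminal on the server: succeeds if anything answers
# HTTP on port 8080 of the loopback interface within five seconds.
calibre_is_responding() {
    curl --silent --output /dev/null --max-time 5 http://127.0.0.1:8080/
}
```

With the server running in one terminal, calibre_is_responding && echo OK in another gives a quick check that does not depend on the EC2 firewall rules.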

5. Set the Linux server to automatically re-start the Calibre service upon reboot

If you do not want to manually restart the Calibre server, you can set up a systemd service to start it automatically.

Create the file /etc/systemd/system/calibre-server.service with the following contents:

[Unit]
Description=Calibre ebook server.

[Service]
Type=simple
User=calibre-data
Group=calibre-data
ExecStart=/usr/bin/calibre-server /home/calibre-data/calibre-library --listen-on=127.0.0.1

[Install]
WantedBy=multi-user.target

The settings in the [Service] section assume that you are running Calibre under a separate account called calibre-data, as detailed in step 2.

The --listen-on=127.0.0.1 option tells the Calibre server to listen only on the loopback interface (127.0.0.1), and not on the Internet-facing interface. This is in anticipation of the setup of a reverse proxy in step 7. If you do not plan to set up a reverse proxy like this, then omit the --listen-on=127.0.0.1 and Calibre will listen on all network interfaces (including the Internet-facing interface) on port 8080.

You can now enable the service (i.e. specify that the service should start when the server starts up) and start the service as follows:

sudo systemctl enable calibre-server
sudo systemctl start calibre-server

and you can stop it with

sudo systemctl stop calibre-server
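
To confirm that the service is set up as intended, a short report such as the sketch below can help. It only uses standard systemctl and journalctl options; the function name is hypothetical:

```shell
# Hypothetical helper: summarise the state of the calibre-server unit.
calibre_service_report() {
    echo "enabled at boot: $(systemctl is-enabled calibre-server)"
    echo "current state:   $(systemctl is-active calibre-server)"
    # Show the most recent log lines, e.g. to see why a start failed
    sudo journalctl -u calibre-server -n 20 --no-pager
}
```

If you later edit the unit file, systemd needs sudo systemctl daemon-reload before a restart will pick up the change.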

6. Periodically send library updates from the local PC to the Linux server

You might add or update books in the Calibre library on the home PC and want to make those available on the server.

The rsync command in step 3 above will upload changes made in the local Calibre library to the server, while not re-copying unmodified files. (This is the advantage of rsync over simple file-copy tools like scp.) However, I suggest stopping the Calibre service before doing this, rather than risking modifying files (including the database file) while the server is running. (I don’t know how significant the risk is, but prefer not to chance it.)

One way to do this is to log into the Linux server and run the stop and start commands as shown above.

An alternative is to automate it in a single bash script running on the local PC. The script below attempts to stop the service, checks whether the service is indeed inactive and, if so, does the rsync. It then restarts the Calibre service.

#!/bin/bash
# rsync-calibre: stop the remote Calibre service, sync the library, restart
ssh ubuntu@SERVER_ADDRESS "sudo systemctl stop calibre-server"
# "systemctl is-active" prints the unit state, e.g. active or inactive
calibre_server_status=$(ssh ubuntu@SERVER_ADDRESS "systemctl is-active calibre-server")
if [ "$calibre_server_status" = "inactive" ]
then
    echo "calibre-server is inactive - starting rsync"
    rsync -avz -e "ssh -i ~/.ssh/calibre_id_rsa" LOCAL_CALIBRE_LIBRARY_DIRECTORY/ calibre-data@SERVER_ADDRESS:/home/calibre-data/calibre-library
else
    echo "calibre-server is not inactive ($calibre_server_status) - not attempting rsync"
fi
# Restart the service whether or not the rsync was attempted
ssh ubuntu@SERVER_ADDRESS "sudo systemctl start calibre-server"

I can then run this script periodically or whenever I have made updates.

Note that this script doesn’t do much error checking. For my purposes, it suffices. In the worst case, if the remote version is corrupted, I can simply re-create it from the version on my local PC, and I may be mildly inconvenienced in the meantime. If you need more resilience, you may wish to make it more robust.
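
If you do want a little more robustness, one possible variation is sketched below. It is my own illustration rather than anything from the tutorial this guide is based on: it wraps the same steps in a function (sync_calibre_safely is a made-up name), gives up early if the service cannot be stopped, and reports a non-zero exit code if the copy or the restart fails.

```shell
# Hypothetical, slightly more defensive variant of rsync-calibre.
# Usage: sync_calibre_safely LOCAL_DIR SERVER_ADDRESS
sync_calibre_safely() {
    local src="$1" server="$2" status rc=0

    ssh "ubuntu@$server" "sudo systemctl stop calibre-server" || return 1

    # is-active exits non-zero when the unit is not running, so ignore
    # its exit code and compare the printed state instead
    status=$(ssh "ubuntu@$server" "systemctl is-active calibre-server" || true)
    if [ "$status" = "inactive" ]; then
        rsync -avz -e "ssh -i $HOME/.ssh/calibre_id_rsa" "${src%/}/" \
            "calibre-data@$server:/home/calibre-data/calibre-library" || rc=$?
    else
        echo "calibre-server is '$status' - not attempting rsync" >&2
        rc=1
    fi

    # Restart whether or not the copy succeeded, so the library stays online
    ssh "ubuntu@$server" "sudo systemctl start calibre-server" || rc=$?
    return "$rc"
}
```

Because the function returns rsync’s exit code on failure, it can be combined with other commands, e.g. sync_calibre_safely LOCAL_CALIBRE_LIBRARY_DIRECTORY SERVER_ADDRESS || echo "sync failed".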

7. Restrict access to the Calibre server with an Apache reverse proxy

The approach above results in the Calibre service listening on port 8080. Anyone who can access the server on that port can use the Calibre library. You might want to restrict this.

One option is to use the authentication built in to Calibre, using the calibre-server --manage-users command. The Calibre documentation on calibre-server and the Digital Ocean tutorial provide more information on this.

I prefer an alternative method: set up Apache to run as a reverse proxy in front of the Calibre service. I can then use Apache authentication/access control to set up a username and password to connect, rather than Calibre authentication. Why do I prefer this? Apache’s access control is more flexible - for example, I can also specify conditions on the source IP addresses which can connect to the server. It lets me use TLS to encrypt connections to the server. If I’m running multiple web sites on my server, it lets me specify the server name which should be forwarded to the Calibre service (so, for example, calibre.example.com is forwarded to the Calibre server while www.example.com can run as a standard Apache web site on the same server).

The Apache project has documentation on using Apache as a reverse proxy and documentation on Apache authentication and authorisation. As a summary/example, you can create a user/password file to be used by Apache and immediately add the user johndoe to it as follows:

sudo htpasswd -c /etc/apache2/passwd johndoe
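
One thing to be aware of is that htpasswd’s -c option creates the file afresh, replacing any existing file, so it should only be used for the first user. A small sketch of a helper that picks the right form (add_apache_user is a hypothetical name, and the file path is the one used in this guide):

```shell
# Hypothetical helper: add or update a user without clobbering the file.
# Usage: add_apache_user USERNAME
add_apache_user() {
    local passwd_file=/etc/apache2/passwd
    if sudo test -f "$passwd_file"; then
        # File already exists: add the user or update their password
        sudo htpasswd "$passwd_file" "$1"
    else
        # First user: create the file
        sudo htpasswd -c "$passwd_file" "$1"
    fi
}
```

In either case htpasswd prompts for the new password interactively.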

You can then set up an Apache site configuration file like /etc/apache2/sites-available/calibre-reverse-proxy.conf similar to the following:

<VirtualHost *:80>
    ServerName calibre.example.com
    AllowEncodedSlashes On
    ProxyPreserveHost On
    # calibre-server should be listening on localhost:8080
    ProxyPass "/"  "http://127.0.0.1:8080/"
    ProxyPassReverse / http://127.0.0.1:8080/

    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log vhost_combined

    <Location "/">
        AuthType Basic
        AuthName "Restricted"
        # (Following line optional)
        AuthBasicProvider file
        AuthUserFile "/etc/apache2/passwd"
        <RequireAll>
            Require user johndoe
            Require ip 192.0.2.0/24
        </RequireAll>
    </Location>
</VirtualHost>

Key updates needed to this file include:

  • Update the ServerName to the hostname you want to use to access the Calibre server. You will also need to set up DNS to point that hostname to the server.
  • Update Require user johndoe to whatever username you used when you set up the username/password file.
  • Update the Require ip 192.0.2.0/24 to your required source IP addresses (or remove the line entirely if you do not need Apache to authorise based on source IP).

You can then enable the site with sudo a2ensite calibre-reverse-proxy and reload Apache with sudo systemctl reload apache2. Once this is done, you should be able to access Calibre by pointing your browser to the chosen hostname, e.g. http://calibre.example.com. Your browser should prompt you for the credentials you set up in the Apache username/password file, and then take you to the Calibre web interface.
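
The ProxyPass directives in the configuration above depend on Apache’s mod_proxy and mod_proxy_http modules, which are not enabled by default in Ubuntu’s Apache packaging. A sketch of the full enabling sequence, assuming the site file name used above (the function name is hypothetical):

```shell
# Hypothetical helper: enable what the reverse proxy needs, then restart.
enable_calibre_proxy() {
    # mod_proxy and mod_proxy_http provide ProxyPass/ProxyPassReverse
    sudo a2enmod proxy proxy_http || return 1
    sudo a2ensite calibre-reverse-proxy || return 1
    # Check the configuration syntax before touching the running server
    sudo apache2ctl configtest || return 1
    # Newly enabled modules need a restart rather than a reload
    sudo systemctl restart apache2
}
```

Because each step returns early on failure, a syntax error reported by configtest leaves the running Apache instance untouched.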

To enable TLS on the service, you can use Let’s Encrypt. The Let’s Encrypt “Getting Started” guide is helpful here - if you have shell access to the server (as we do in this example) it recommends Certbot, whose site provides straightforward documentation for your choice of OS and web server. When running certbot, if you have other web sites on the server, I recommend using the -d option to specify the domain you are setting up TLS for, e.g. where it recommends sudo certbot --apache I would use sudo certbot --apache -d calibre.example.com. This gets the certificate, and also updates your Apache configuration files.

Now you can point your browser to the chosen hostname using TLS, e.g. https://calibre.example.com. If your browser hasn’t stored the username/password it should ask for them again, and you will then get to the Calibre web interface over an encrypted connection.
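
It is also worth checking from the command line that the library really is protected. The sketch below is a thin wrapper around curl that prints only the HTTP status code; http_status is a name invented for this example:

```shell
# Hypothetical check: print just the HTTP status code for a request.
# Usage: http_status [curl options...] URL
http_status() {
    curl --silent --output /dev/null --max-time 10 \
        --write-out '%{http_code}\n' "$@"
}
```

Run from an allowed IP address, http_status https://calibre.example.com/ should report 401 if Apache is demanding credentials, whereas 200 would suggest the library is open to anyone. Adding --user johndoe before the URL makes curl prompt for the password, after which I would expect 200. Finally, http_status http://SERVER_ADDRESS:8080/ should fail to connect (curl reports 000) if Calibre is listening only on the loopback interface.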

Future plans

In principle I believe it should be possible to use TLS client certificates to provide authentication as an alternative (or even in addition) to the username/password authentication shown here. That means authentication would be based on a TLS private key installed on the devices from which I want to connect to the library. In my use case, as that is a very small number of devices and they are all mine, the administration of key distribution etc. wouldn’t be so hard. But initial experimentation didn’t work out, so something to look at again one day…

I hope this has been of some interest/use. And if you spot any errors/issues with this article, please do let me know - thanks!



Creative Commons licensing: This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License
