I have some servers running different services and would like to have backups of the ones I deem most important. Since I have minimal experience with Bacula and have to administer backups for a few servers at work, I decided to write about how I set it up, how to use that setup and how to configure clients. At the start of writing this, I have not put a lot of thought into anything other than that I wish to use Bacula and that there should be a weekly full backup and daily incremental backups of each host. I’ll try to be thorough as I write this and explain my choices as I get further into the project. One of my colleagues has already made a similar post (this one), which I am using as my initial source to learn from, and I will expand on it as I go.

For the purpose of this writing I have decided to take backups of two servers and store them on my NAS, which is simply a dedicated server with a lot of storage space in it. Exactly how to do that is something I will have to find out as I come to understand how this works. Since my servers are, from my point of view, in production, I have created a set of virtual machines for initial trial-and-error testing of Bacula before installing and configuring it on the actual servers.

Halfway into learning about Bacula, I learned that it is designed primarily to work with tape drives or other interchangeable storage media, so that in theory there is an unlimited amount of storage. My setup will be based on a fixed amount of storage space, namely the size of my NAS, and I will try to the best of my understanding to make the setup work within that constraint.

While explaining the configuration I will not explain settings I deem to be self-explanatory. If a setting I skipped is not self-explanatory to you, the documentation found on Bacula’s home page is comprehensive, interesting and informative.

Introduction

Bacula is a client/server system designed to be scalable, meaning that it is divided into several parts where each part has its purpose. Depending on the size and throughput of your server park, these different parts can be deployed on different servers to better handle large amounts of data, I/O, CPU time, databases, etc.

The different components of Bacula

Bacula is made up of five components: the console, the file server, the catalog, the storage and the director. The catalog is more of a conceptual component than an actual component in itself.

Console

The console is an interface to all the other parts of Bacula and it works by communicating with the director. Bacula comes with console software, but any software designed to communicate with the director can be used, and there are several options out there. The purpose of the console is to manage the system, not to configure it. Using the console, it is possible to see and manage existing backups, schedule extra backups when needed and restore individual files or whole systems. Depending on what kind of hardware you have, it can also be used to tell Bacula that tapes have been swapped, to name them, etc.
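As a rough illustration of what managing the system through the console looks like, the sketch below pipes a few commands into bconsole, the console program shipped with Bacula. The helper name console_cmd and the CONSOLE variable are made up for this example; status dir, list jobs and messages are ordinary console commands.

```shell
# Hypothetical helper: send one console command per argument to the console
# program named in $CONSOLE (bconsole unless overridden).
CONSOLE=${CONSOLE:-bconsole}

console_cmd() {
    printf '%s\n' "$@" | "$CONSOLE"
}

# Example (needs a running director and a configured bconsole):
#   console_cmd "status dir" "list jobs" "messages"
```

Keeping the console program in a variable makes it easy to swap in another console that reads commands from standard input.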

File server

This is the component you install on the client. The director will contact the file server and tell the client to start backing up data and transfer it, or to start receiving data for a restore. This component does not decide what is to be backed up or restored. Its only responsibility is to copy data and record metadata for whatever it has been told to back up and transfer it to the server, or to receive data from the server and store it on the client itself.

Catalog

The catalog keeps a complete index of files, folders and time data, plus information about prior and subsequent backups. The metadata itself is stored in an SQL database by default, and that database can be hosted on a separate host to support scalability. The catalog is not a component like the others, but a concept of where the metadata for the system and its backups is stored. This means that, where there is support for it, the catalog can be a flat file structure, a database, etc.

Storage

The storage component does the actual saving of data to physical media, be it file systems, tape storage, CD-ROMs or one of a wide set of other storage systems. This component can be hosted on separate servers since all communication with the rest of the Bacula system is done using TCP/IP. For large server parks, the amount of data can be too much for one host to handle, so it is possible to have several storage servers controlled by one director. What is important to understand here is that the storage daemon handles hardware and understands how to use that hardware in different situations.

Director

The director is central to everything that is done within the system and sews all the other components together. Using the static configuration and individual client configurations, the director will schedule backup jobs, restore jobs and relay whatever the console component is asking for. This description of the director is at best based on poor understanding of the director. It will be updated when or if this understanding becomes better.

My setup

My backup needs at this moment in time make Bacula overkill, but since I want to learn, I still choose to use it. Knowing that the different components can be deployed on individual hosts or all on the same one, I choose to install the director, console and storage components on the backup server, the file servers on the clients being backed up and the catalog on an already set up database server on my local network.

The backup and database servers are physical machines with access to storage space on a NAS; they run Debian stable, version 7.7. The clients are two virtual machines also running Debian stable 7.7. For the sake of my own learning, I might also add my personal Windows 7 computer as a client, just to see how the setup differs.

Installing the basic components on the server

On the NAS, I make sure that the system is up to date before installing the components Bacula needs on this host. When dbconfig-common asked to automatically configure the database, I chose no. I’ll configure the catalog later since there is no local database server to automatically configure.

root@nas:~# apt-get update && apt-get upgrade
root@nas:~# apt-get --no-install-recommends install bacula-director-mysql \
    bacula-console \
    bacula-doc \
    bacula-sd \
    bacula-sd-mysql

Static configuration of the components

Since Bacula is designed to be distributed over several servers if needed, the configuration is also split into a separate file for each component, with the exception of the catalog, which is configured in the director’s config file.

Catalog

The database that will handle the catalog exists on a separate host, as explained earlier, so the installation cannot be done automatically by dbconfig-common. I begin by creating a database and a user on the remote host and then populate the database with the initial table structure. Creating the database and user and granting privileges to that user is relatively straightforward; if that is not the case for you, just Google it and you will find a lot of sites explaining it. There is an sh script located at /usr/share/bacula-director/make_mysql_tables, which I copied. I removed all traces of the script itself, leaving only the SQL statements, and then used the result as input to the mysql client.
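For readers who would rather not search for it, creating the database and user might look roughly like the sketch below. The function name catalog_sql is made up, and the database name, user name, client host and password are placeholders in the same spirit as the ones used elsewhere in this post. The function only prints the SQL, so it can be reviewed before it is sent to the server.

```shell
# Hypothetical helper: print the SQL that creates a catalog database and a
# user allowed to connect from one host. Arguments: database name, user
# name, host the director connects from, password.
# Passwords containing a single quote are not handled by this sketch.
catalog_sql() {
    db=$1 user=$2 client_host=$3 pass=$4
    # Backticks around the database name keep names that clash with SQL
    # keywords (such as "database") usable.
    printf 'CREATE DATABASE IF NOT EXISTS `%s`;\n' "$db"
    printf "CREATE USER '%s'@'%s' IDENTIFIED BY '%s';\n" "$user" "$client_host" "$pass"
    printf "GRANT ALL PRIVILEGES ON \`%s\`.* TO '%s'@'%s';\n" "$db" "$user" "$client_host"
}

# Example, assuming the director runs on a host called "nas" and you have
# an administrative MySQL account on the database server:
#   catalog_sql database user nas pass | mysql -h db -u root -p
```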

root@nas:~# mysql -h db -u user -p database < ~/make_mysql_tables
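The stripped-down SQL file used as input above was made by hand, but the same result could probably be reached with a small filter. The sketch below assumes that the script passes its SQL to mysql through a here-document delimited by END-OF-DATA; check your copy of the script before relying on that. The function name extract_sql is made up for this example.

```shell
# Assumption: the SQL sits in a here-document that starts on a line
# containing "<<END-OF-DATA" and ends on a line reading "END-OF-DATA".
extract_sql() {
    awk '
        /^END-OF-DATA$/ { inside = 0 }   # closing marker: stop copying
        inside          { print }        # copy lines inside the here-document
        /<<END-OF-DATA/ { inside = 1 }   # opening marker: start with the next line
    ' "$1"
}

# Example:
#   extract_sql /usr/share/bacula-director/make_mysql_tables > ~/make_mysql_tables
```

Any shell variables left inside the extracted SQL would still have to be replaced by hand, so it is worth reading through the result before importing it.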

The director component has to be configured to use the external database. Edit the Catalog resource found in /etc/bacula/bacula-dir.conf.

Catalog {
  Name = MyCatalog
  dbaddress = db
  dbport = 3306
  dbname = "database"
  dbuser = "user"
  dbpassword = "pass"
}

Note that I have just written db for the hostname. Unless you have an entry for that name in /etc/hosts, you should use an IP address or an FQDN.

Storage

On the NAS server I have mounted a share at /media/backup, which at this moment is 637 GB in size. This share will be defined as a device in the storage daemon’s configuration file and will be the only device used for all backup purposes. If I need more space at a later time, I can create additional shares for Bacula to use.

root@nas:/home/hamartin# df -h | tail -1
/dev/mapper/media-backup  637G  198M  605G   1% /media/backup
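Since the share has a fixed size, it may be useful to keep an eye on how full it gets. The following is a minimal sketch built on POSIX df output; the function names usage_pct and warn_if_full and the 80 percent default are arbitrary choices for this example, not something Bacula provides.

```shell
# Print how full the filesystem holding directory $1 is, as a whole number
# of percent (the "Capacity" column of POSIX df output).
usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print a warning when usage of $1 is at or above $2 percent (default 80).
warn_if_full() {
    pct=$(usage_pct "$1")
    if [ "$pct" -ge "${2:-80}" ]; then
        echo "WARNING: $1 is ${pct}% full"
    fi
}

# Example:
#   warn_if_full /media/backup 80
```

Run from cron, a check like this would give some advance notice before the backup volume runs out of space.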

The configuration for the storage daemon can be found at /etc/bacula/bacula-sd.conf. We begin the configuration by defining the storage daemon itself.

Storage {
  Name = nas-sd
  SDPort = 9103
  SDAddress = 127.0.0.1
  Pid Directory = "/var/run/bacula"
  WorkingDirectory = "/var/lib/bacula"
  Maximum Concurrent Jobs = 2
}

Each storage daemon must have a name, which it uses for authentication and messaging purposes. Names only have to be unique among the storage daemons controlled by one director, so two daemons under different directors can technically share a name. Although it’s possible, I see no point in doing that. Take the time to create a proper naming policy instead. I set Maximum Concurrent Jobs to 2 (the default is 20), simply because I have no idea how to estimate what this value should be for my setup.

Now that the storage daemon has been defined, it has to be made aware of which directors are allowed to contact and manage it. The configuration will have two director definitions, one for management and control and one for monitoring. Each definition has a name and a password. This name and password have to be identical to the ones that will be defined in the director’s configuration file later, and all names have to be unique within one director’s configuration file. There will be several places within the configuration of this system where you have to create new passwords. I chose to install pwgen and use it to generate secure, random passwords. See the section on generating secure passwords near the end of this post for the details.

Director {
  Name = nas-dir
  Password = "pass"
}

Director {
  Name = nas-mon
  Password = "pass"
  Monitor = yes
}

Note that the second director definition has Monitor = yes set, which simply means that any director connecting to this storage daemon with that name and password is only able to monitor the daemon, not control it.

Next, it is time to define a device on which data can be stored and which the director can refer to in its configuration. How devices are configured here depends on what kind of hardware is being used. Since I will be using fixed storage space which is already mounted on the host system, my device resource will be relatively simple. If a tape exchanger or similar is to be used, the device configuration quickly becomes more complex. Take a look at this post if you’re configuring Bacula to use tape exchangers.

Device {
  Name = NASBackup1
  Device Type = File
  Archive Device = "/media/backup"
  LabelMedia = Yes
  Random Access = Yes
  AutomaticMount = Yes
  RemovableMedia = No
  AlwaysOpen = No
}

A logical name has been given to the device, and the device is of type File, which means that it is a random access device and is either fixed or removable. AlwaysOpen has been set to No simply because this setting is ignored for File devices. It is specifically a setting for Tape, Fifo and DVD devices, which forces Bacula to keep the device open. This also means that the device cannot be used for anything else as long as it is mounted within Bacula.
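For a File device, the Archive Device has to be an existing directory that the storage daemon can write to. A quick sanity check could look like the sketch below; the function name check_archive_dir is made up, and the check only says something about the user who runs it, so it should be run as whichever user your storage daemon runs as.

```shell
# Report whether directory $1 exists and is writable by the current user.
check_archive_dir() {
    if [ -d "$1" ] && [ -w "$1" ]; then
        echo "ok: $1 is a writable directory"
    else
        echo "problem: $1 is missing or not writable by $(id -un)"
        return 1
    fi
}

# Example:
#   check_archive_dir /media/backup
```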

The last resource in the storage daemon configuration file defines where all messages created by the storage daemon are to be sent.

Messages {
  Name = Standard
  director = nas-dir = all
}
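Before restarting a daemon it can be worth letting it parse its configuration file first. The Bacula daemons accept -t, which reads the configuration and exits, together with -c to point at the file. The wrapper below is only a sketch; the function name test_conf and its messages are made up for this example.

```shell
# Run "<daemon> -t -c <conf>" if the daemon binary is installed; otherwise
# say so and carry on.
test_conf() {
    daemon=$1 conf=$2
    if ! command -v "$daemon" >/dev/null 2>&1; then
        echo "skipped: $daemon not found"
        return 0
    fi
    "$daemon" -t -c "$conf" && echo "ok: $conf"
}

# Examples:
#   test_conf bacula-sd  /etc/bacula/bacula-sd.conf
#   test_conf bacula-dir /etc/bacula/bacula-dir.conf
```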

Director

Now that the storage daemon has been configured with a place to store data and the director has been given permission to monitor and control it, I need to configure the director itself. This configuration file is extensive and I will only be using a small part of it. It can be found at /etc/bacula/bacula-dir.conf. As with the storage daemon, the director has to define itself too.

Director {
  Name = nas-dir
  DIRport = 9101
  DirAddress = 127.0.0.1
  QueryFile = "/etc/bacula/scripts/query.sql"
  WorkingDirectory = "/var/lib/bacula"
  PidDirectory = "/var/run/bacula"
  Maximum Concurrent Jobs = 1
  Password = "pass"
  Messages = Daemon
}

The name and password for the director have to be copied to the console configuration file found at /etc/bacula/bconsole.conf. Maximum Concurrent Jobs was 1 by default, which confuses me since the storage daemon resource had a default of 20. Most messages have a designated destination, but some do not, and those are sent to the Daemon messages resource, which, as you will see later, sends an email to a specific local user. In my case that will be the root user.

Console {
  Name = nas-mon
  Password = "pass"
  CommandACL = status, .status
}

To be able to get messages, errors and warnings from Bacula, messages need to be configured. I will configure the messages to be sent to root on the host itself. If you have a functioning email setup on the server, you have two options: either put the email address that Bacula should send to directly in the Messages resource, or send the emails to a local user and edit /etc/aliases to forward them to an address of your choice. I like the latter best, as it forwards all emails sent to that local user instead of just the Bacula emails.

Messages {
  Name = Daemon
  mailcommand = "/usr/sbin/bsmtp -h localhost -f \"\(Bacula\) \<%r\>\" -s \"Bacula daemon message\" %r"
  mail = root = all, !skipped
  console = all, !skipped, !saved
  append = "/var/log/bacula/bacula.log" = all, !skipped
}

Messages {
  Name = Standard
  mailcommand = "/usr/sbin/bsmtp -h localhost -f \"\(Bacula\) \<%r\>\" -s \"Bacula: %t %e of %c %l\" %r"
  operatorcommand = "/usr/sbin/bsmtp -h localhost -f \"\(Bacula\) \<%r\>\" -s \"Bacula: Intervention needed for %j\" %r"
  mail = root = all, !skipped
  operator = root = mount
  console = all, !skipped, !saved
  append = "/var/log/bacula/bacula.log" = all, !skipped
  catalog = all
}
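The /etc/aliases route mentioned earlier could be handled with something like the sketch below. The function name ensure_alias and the address admin@example.org are placeholders for this example. The function appends an alias only when the user does not already have one, and the alias database still has to be rebuilt afterwards with newaliases.

```shell
# Append "user: target" to the aliases file $1 unless the user already has
# an entry there.
ensure_alias() {
    file=$1 user=$2 target=$3
    if grep -q "^$user:" "$file"; then
        echo "alias for $user already present in $file"
    else
        printf '%s: %s\n' "$user" "$target" >> "$file"
    fi
}

# Example (as root):
#   ensure_alias /etc/aliases root admin@example.org && newaliases
```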

Earlier, I defined the storage daemon. Now I will link that storage daemon to this director by adding or editing a storage resource. The name is arbitrary, but should be descriptive, as with all names defined in the configuration files. The device should be the logical name of a device defined in the storage daemon’s configuration file. The media type is also arbitrary, but should be the same for all storage resources which are of the same type and support the same “things”. Unless you know exactly what you are doing, do not under any circumstances create two storage resources which point to the same device on the storage daemon. The documentation states that this can make Bacula block and go into a deadlock. If there is an auto changer for tapes, this resource should point to the auto changer device and not to the individual tape drive devices.

Storage {
  Name = File
  Address = 127.0.0.1
  SDPort = 9103
  Password = "pass"
  Device = NASBackup1
  Media Type = File
}
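The password in this Storage resource has to match the one in the nas-dir Director resource of the storage daemon’s configuration, and a mismatch is easy to make when copying passwords around. The sketch below pulls the password out of a named resource so the two can be compared. The function name resource_password is made up, and the parsing assumes the simple layout used in this post: the opening brace on the same line as the resource type, one directive per line and passwords without spaces.

```shell
# Print the Password of the resource of type $2 whose Name is $3 in file $1.
resource_password() {
    awk -v type="$2" -v name="$3" '
        $1 == type && $2 == "{"    { inside = 1; n = ""; p = ""; next }
        inside && $1 == "Name"     { n = $3 }
        inside && $1 == "Password" { p = $3 }
        inside && $1 == "}" {
            if (n == name) {
                gsub(/"/, "", p)   # drop the surrounding double quotes
                print p
            }
            inside = 0
        }
    ' "$1"
}

# Example:
#   sd_pw=$(resource_password /etc/bacula/bacula-sd.conf Director nas-dir)
#   dir_pw=$(resource_password /etc/bacula/bacula-dir.conf Storage File)
#   [ "$sd_pw" = "$dir_pw" ] && echo "passwords match" || echo "passwords differ"
```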

Setting up jobs, schedules and clients

With the configuration done so far, the system should now be up and running, but it will do absolutely nothing other than occupy a minimal amount of resources on the server. It is time to configure the jobs themselves and schedule them at certain intervals.

Generating secure passwords

Install the pwgen package first.

root@nas:~# apt-get install pwgen

An example of how to generate one secure password consisting of 32 alphanumeric characters.

root@nas:~# pwgen -s 32 1
SYkGmfcSt0znPdw1CzYFUntY1fGUxH0G
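On a host where installing pwgen is not an option, a similar password can be produced with standard tools. This is a different technique from pwgen, shown only as a sketch; the function name random_password is made up for this example.

```shell
# Print a random alphanumeric password of length $1 (default 32).
# 1024 random bytes are filtered down to letters and digits, which leaves
# far more characters than needed for lengths such as 32.
random_password() {
    length=${1:-32}
    head -c 1024 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | head -c "$length"
    echo
}

# Example:
#   random_password 32
```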

Todo

Read and understand the best practices for creating a backup solution with Bacula with limited storage resources. Discuss adding local backup in addition to remote backup. Find out how to determine how many tasks can run concurrently without losing performance.

Sources