One of the problems I've seen with running "virtual machines" is that if the sessions are all running the same Linux distribution, the same physical machine ends up downloading the same update files multiple times- once for each session. To me it makes more sense to download them from the repository once, and configure the clients to get their updates from a repository on the same machine.
There are three steps to making this happen. First, the machine hosting the repository needs enough disk space to hold all of it. (I'm keeping a mirror of the CentOS 5.0 "base" and "updates" repositories, and it's using about 7.5GB of disk space.) The second step is a process to copy the contents of one of the existing repositories to the local machine. The third step is a web or FTP server which makes the files available to the client sessions.
The disk space issue can be handled by making sure you have enough disk space when setting up the LVs for the session, by extending an existing LV (and the filesystem it contains) so that it has enough free space, or by mounting more disk space (whether via a new LV or a new disk partition.) This page does not cover the specifics; we assume that you have enough disk space available.
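If you do end up needing to grow an existing LV, the general idea is something like the following (the volume group and LV names here are only examples, and this assumes an ext3 filesystem which resize2fs can grow):
# lvextend -L +10G /dev/vg0/pub
# resize2fs /dev/vg0/pub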
On my own server, I created a directory in the xen0 session called "/pub", for the purpose of holding these files. You will see this in the examples below.
The CentOS repository is available for download using the rsync program. This program was designed for "mirrors"- it compares the contents of the source and destination, and only copies the necessary files to make the destination "match" the source. The command has options which control how the comparisons are done and how the files are transferred.
I have written a script which "pulls" the CentOS repository to a local directory. It maintains the directory structure as it sits on the server, it only copies the files which you don't already have on your local machine, and it deletes files from your local machine which have been deleted from the server.
File: pull-mirror
Size: 1,698 bytes
Date: 2007-12-03 02:28:07 +0000
MD5: 77b53839e969394bbdab5a55d40017ec
SHA-1: b4b721486df128a34f06f6c05dd85792be696e0f
RIPEMD-160: e637f5cfc045e207d15c7a5f5bdbdc7cd9263732
PGP Signature: pull-mirror.asc
The script is configured to store the local copy under the "/pub" directory on the local system. You can change this by editing the LOCAL= line near the top of the script. The directory where you store the files should be world-readable. On my system I have "LOCAL=/pub/centos", which means that I had to create both "/pub" and "/pub/centos", and make them world-readable using a command like "chmod 755 /pub /pub/centos".
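In other words, for my setup, that was:
# mkdir /pub /pub/centos
# chmod 755 /pub /pub/centos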
When you run the script the first time, it will download the entire repository from the remote server. After this is done, running the script again will result in only downloading whatever files have been changed since you last ran the script.
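If you're curious what the script is doing, or would rather roll your own, the heart of the operation is a pair of rsync commands along these lines (the mirror hostname here is a placeholder- use a real CentOS mirror which offers rsync access):
# rsync -avz --delete rsync://mirror.example.com/centos/5.0/os/ /pub/centos/5.0/os/
# rsync -avz --delete rsync://mirror.example.com/centos/5.0/updates/ /pub/centos/5.0/updates/
The "-a" option preserves the directory structure and file attributes, "--delete" removes local files which no longer exist on the mirror, and "-z" compresses the transfer.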
You can keep your copy as "fresh" as the repository from which you pull the files by setting up a cron job to run the script at regular intervals (once or twice a day). On my server, I stored the script as "/root/bin/pull-mirror", and then created an /etc/cron.d/mirror file which looks like this:
MAILTO="reports@jms1.net"
17 1,13 * * * root /root/bin/pull-mirror
The result is that my machine updates its copy of the repository from the main repository at 0117 and 1317 every day.
Once you have a local copy of the repository, the next step is to make the files available to your clients, using a method supported by the "yum" program which the clients will be using. It supports HTTP, FTP, and FILE (i.e. direct filesystem access to the repository.)
My server has an FTP server running, which is only available to the clients (so that I don't kill my bandwidth by becoming a CentOS mirror for the entire world.) I'm using vsftpd to provide the service, since it comes with CentOS 5 and because it can be configured for the necessary level of security (i.e. no access to normal user files, no write access to any files at all.)
Installing the vsftpd package involves one command:
# yum install -y vsftpd
...
Once it's installed, we need to configure it. The configuration involves editing one file, /etc/vsftpd/vsftpd.conf. The important options look like this:
listen_address=192.168.250.162
Listen on the private IP only, so the outside world can't access the FTP server.
anonymous_enable=YES
Enable "anonymous" FTP users
local_enable=NO
Do NOT allow local users- we are "anonymous" only.
write_enable=NO
Do NOT allow any "writing" commands- we are "download" only.
anon_root=/pub
Clients see this as their "root" directory
listen=YES
Run as a self-contained daemon rather than from inetd.
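Putting those options together, the relevant part of /etc/vsftpd/vsftpd.conf ends up looking something like this (the stock file contains many other options and comments; these are the important ones for this setup, and the IP address is from my example- use your own private address):
listen=YES
listen_address=192.168.250.162
anonymous_enable=YES
local_enable=NO
write_enable=NO
anon_root=/pub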
Before starting the service, you should make sure that the file area exists and is world-readable. For example, if you're using "/pub", as I did in my xen0 session...
# mkdir -m 755 /pub
Once the service is configured and the file area is set up, you can start the service using this command:
# service vsftpd start
And to make sure it runs automatically when the system boots...
# chkconfig --level 345 vsftpd on
You should obviously test it before relying on it. You should be able to access it using the configured IP address, but not via 127.0.0.1 or any other IP addresses which may be configured on the machine. It should allow you to log in as "anonymous" or "ftp" (which means the same thing) but not as a normal user, even using the correct password. And of course it should not allow you to do any command which involves writing data (i.e. PUT, MKDIR, DELETE, RMDIR, etc.)
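A quick way to test from one of the client sessions is to pull a directory listing anonymously with something like wget (substituting your own repository IP):
# wget -O - ftp://192.168.250.162/centos/
The same command pointed at 127.0.0.1 (run on the server itself) should fail, since the daemon is only listening on the private IP.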
The last piece, of course, is to make the clients USE your repository. The CentOS installer sets up the new machines to download updates from the CentOS mirrors on the network. It does this by creating ".repo" files which tell "yum" where to find the repositories (or where to find a list of mirrors.)
To make the clients use your repository, you need to create a .repo file which points to your new FTP service, install that in the clients' filesystems, and rename or remove the default .repo files that CentOS creates.
On my own server, I rename the default .repo files so that their names end with ".repo.not". Since yum only uses files whose names end with .repo, this disables those repositories, but it does so in a way which leaves the option open for a client who wants to use the standard network repositories for some reason.
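For example, on CentOS 5 the default repository definitions normally live in /etc/yum.repos.d/CentOS-Base.repo (there may be others, such as CentOS-Media.repo), so the renaming looks like this:
# cd /etc/yum.repos.d
# mv CentOS-Base.repo CentOS-Base.repo.not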
The only remaining piece is the .repo file pointing to your FTP server. On my server, the "/etc/yum.repos.d/jms1.repo" file looks like this:
[base]
name=CentOS-$releasever - Base
baseurl=ftp://192.168.250.162/centos/5.0/os/$basearch/
[updates]
name=CentOS-$releasever - Updates
baseurl=ftp://192.168.250.162/centos/5.0/updates/$basearch/
That's really all there is to it. You should be able to run any yum command as you normally would, but it will run much more quickly because the files it needs are not coming from across the Internet.
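One quick way to verify that a client is really using your local repository is to clear yum's cache and ask it to check for updates- the URLs it reports while downloading the repository metadata should point at your FTP server:
# yum clean all
# yum check-update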
I'm running the FTP service on my xen0 session. However, nothing says you have to do it this way- you can set this up on any session on the machine.
Don't forget that the session which is running the FTP service can (and should) also be configured to pull updates from that FTP server, just like any other session on the machine.