High-availability ad absurdum – instant messenger cluster using DRBD and Finch (Pidgin)

It is often unjustly said that implementing high availability under Linux is way too complex. Of course you will have to be patient and spend some time learning the required basics, but all of this is feasible for an experienced administrator (or someone who wants to become one some day). This example shows how easily a simple 2-node cluster can be built.

When it is necessary to keep data synchronous between multiple hosts, implementing a DRBD (Distributed Replicated Block Device) might be the most elegant and easiest solution.

You can attach a dedicated LUN to your servers and place the application (which shall be protected by the cluster) on it. In combination with heartbeat, Pacemaker or similar HA solutions, DRBD is the core of highly available Linux applications.

The concept is simple and brilliant at the same time. There are no bounds to the range of possibilities, as the following slightly escapist example shows.

High-availability ad absurdum

If you’re communicating online you surely know plenty of instant messengers, including the three well-known ones Skype, ICQ and Jabber. To use these protocols under Linux as well, there are multi-protocol messengers like Pidgin. There is a special version of Pidgin which uses a curses text interface instead of a graphical user interface – Finch. This tool is used as the mission-critical application in an active/passive heartbeat cluster in this example. The result is a highly available instant messenger that automatically fails over to another node (in another fire area) without losing its configuration and data. I’m sure that some readers will now start wondering how they could ever live without such an application. :)

In this example two Raspberry Pis are used behind two conventional DSL routers. Thanks to a port forwarding rule (which has to be created identically on both routers) the hosts can be reached via SSH from “out there” (WAN). You might want to choose a non-standard port instead of the default port 22. With GNU screen, a terminal session running Finch can be detached and resumed at any time, so your personal “chat shell” is reachable from every host connected to the internet. Both routers are connected by an IPsec VPN in this example. The Raspberry Pis are able to ping and communicate with each other through a secured tunnel, even though they are in two different network segments.

Later on, a tool called heartbeat checks the nodes for availability. This mechanism uses the VPN between the routers; another possibility is to implement a point-to-point VPN between the nodes (e.g. using OpenVPN). If a cluster node fails, the other node is informed about the failure, takes over access to the shared storage (discussed later!) and restarts the application as soon as possible (active/passive cluster principle).

Besides the two Raspberry Pis, two USB sticks are needed as the replicated block device, because DRBD doesn’t like pseudo-devices such as files created with dd. The configuration and log files of Finch are saved on this “cluster disk” so that the application always has the same data, independent of the node it’s currently running on.

Design and network

For this example I registered a NoIP hostname – after registration the dynamic hostname can be updated using a special Linux utility provided by NoIP. This tool needs to be compiled and installed on the two Raspberry Pi:

both-nodes # apt-get install gcc curl
both-nodes # wget http://www.no-ip.com/client/linux/noip-duc-linux.tar.gz
both-nodes # tar xfz noip-duc-linux.tar.gz
both-nodes # cd noip-*
both-nodes # make && make install
...
Please enter the login/email string for no-ip.com  
Please enter the password for user '...'  ***
...
Please enter an update interval:[30]  44640
...

By default, the NoIP client runs in the background, and that’s exactly what we don’t want in this cluster setup. Instead, the IP needs to be updated whenever heartbeat performs a failover: if the application fails over from one node to another, the hostname has to point to the new node so that access to the correct node is possible. Testing the compiled tool is necessary to make sure that it works as expected:

any-node # /usr/local/bin/noip2 -i $(curl --silent http://icanhazip.com)
any-node # ping chat.noip.com
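The update command above passes whatever icanhazip.com returns straight to noip2, so it can help to validate the value first. The following is only a minimal sketch under that assumption; the helper name is_ipv4 is made up for illustration and is not part of the NoIP client.

```shell
# is_ipv4 VALUE: succeeds only if VALUE looks like a dotted-quad IPv4 address.
# Hypothetical helper, not part of the NoIP client.
is_ipv4() {
    # four groups of one to three digits, separated by dots
    echo "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || return 1
    # every octet must be in the range 0-255
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    for octet in "$@"; do
        [ "$octet" -le 255 ] || return 1
    done
}

# Possible use on a cluster node (commented out, needs network access and noip2):
# ip=$(curl --silent http://icanhazip.com)
# is_ipv4 "$ip" && /usr/local/bin/noip2 -i "$ip"
```

With such a check an empty or garbled response would simply skip the update instead of sending a broken value.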

Additional packages for drbd and heartbeat have to be installed:

both-nodes # apt-get install drbd8-utils heartbeat

DRBD

Before the shared cluster storage is created, it is necessary to ensure that both nodes are able to communicate with each other. It is recommended to create local entries in the /etc/hosts file, even if you have a working DNS, to be independent of a possibly faulty DNS service. After that, pinging the nodes has to work:

both-nodes # vi /etc/hosts
....

192.168.1.2   hostA.fqdn.dom hostA
192.168.2.2   hostB.fqdn.dom hostB

ESC ZZ

node-a # ping hostA
node-a # ping hostB
node-b # ping hostA
node-b # ping hostB

The connected USB sticks are re-partitioned (existing partitions are removed) – a Linux partition (type 83) is created. After that the DRBD configuration file is altered:

both-nodes # fdisk /dev/sda << EOF
d
4
d
3
d
2
d
1

n
p
1

w
EOF

both-nodes # cp /etc/drbd.conf /etc/drbd.conf.initial
both-nodes # vim /etc/drbd.conf
...
resource drbd1 {
  protocol C;

  syncer {
    rate 75K;
    al-extents 257;
  }
  on hostA.fqdn.dom {
    device    /dev/drbd1;
    disk      /dev/sda1;
    address   192.168.1.2:7789;
    meta-disk internal;
  }
  on hostB.fqdn.dom {
    device    /dev/drbd1;
    disk      /dev/sda1;
    address   192.168.2.2:7789;
    meta-disk internal;
  }

}

This defines a volume drbd1 which is synchronized to the device /dev/sda1 on each of the nodes hostA.fqdn.dom and hostB.fqdn.dom.
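Since every node needs its own “on” section, a quick sanity check of the configuration can be useful. This is a rough sketch, assuming the simple one-line “on <host> {” layout used above; the function name duplicate_on_hosts is hypothetical.

```shell
# duplicate_on_hosts FILE: print every host name that appears in more than
# one "on <host> {" section of a DRBD configuration file.
# Hypothetical helper; assumes "on" and the host name are the first two
# words of the line, as in the configuration shown above.
duplicate_on_hosts() {
    awk '$1 == "on" { print $2 }' "$1" | sort | uniq -d
}

# Possible use:
# duplicate_on_hosts /etc/drbd.conf
```

No output means that each host is only defined once.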

The following line is very important:

rate 75K;

This line defines the maximum synchronization rate in bytes per second, in this case 75 KB/s. The appropriate synchronization rate depends on several factors and should be chosen carefully. For example, the defined rate should not exceed the maximum throughput of the storage medium. If DRBD and the application share the same network segment (a dedicated network for DRBD is preferable) you will also have to consider the needs of the other network traffic.

A rule of thumb is to set the syncer rate to 0.3 times the effective network bandwidth. There is an example in the DRBD handbook:

110 MB/s bandwidth * 0.3 = 33 MB/s
...
rate 33M;

If the network isn’t able to provide the defined maximum rate, the synchronization is throttled automatically. More information about synchronization can be found in the DRBD handbook: [click me!].
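The rule of thumb can be written down as a tiny helper. This is just an illustration of the arithmetic above; the function name sync_rate is made up, and the result is rounded down to a whole number.

```shell
# sync_rate BANDWIDTH: 30 % of the given effective bandwidth, rounded down.
# The unit of the result is the unit of the input (e.g. MB/s).
sync_rate() {
    echo $(( $1 * 3 / 10 ))
}

sync_rate 110    # prints 33
```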

Afterwards the volume is created, activated and formatted on one node. After this, the new volume is mounted:

node-a # drbdadm create-md drbd1

==> This might destroy existing data! <==
Do you want to proceed? [need to type 'yes' to confirm] yes ...

node-a # drbdadm up drbd1
node-a # drbdadm primary drbd1
node-a # mkfs.ext4 /dev/drbd1
both-nodes # mkdir /finch
node-a # mount /dev/drbd1 /finch

The changes are then replicated to the other node, which can take some time for the first initialization (time for a coffee!). In my case the initialization of a 256 MB USB stick took about one hour using a VPN with 75 kbit/s upload rate. The status can be seen by reading the file /proc/drbd:

node-b # drbdadm connect drbd1
node-b # uptime
 16:40:30 up  2:08,  1 user,  load average: 0,20, 0,16, 0,10

both-nodes # cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
srcversion: A9694A3AC4D985F53813A23

 1: cs:SyncTarget ro:Secondary/Primary ds:Inconsistent/UpToDate C r-----
    ns:0 nr:104320 dw:104320 dr:0 al:0 bm:6 lo:1 pe:2321 ua:0 ap:0 ep:1 wo:f oos:148564
        [=======>............] sync'ed: 42.0% (148564/252884)K
        finish: 0:32:11 speed: 64 (24) want: 71,680 K/sec
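If you want to poll the progress from a script, the two interesting values can be cut out of this output. The following is a sketch that assumes the DRBD 8.3 output format shown above; both function names are made up for illustration.

```shell
# Both helpers read /proc/drbd-style text from standard input.
# They assume the DRBD 8.3 layout shown above (hypothetical helpers).

# Print the connection state (the "cs:" field), e.g. SyncTarget or Connected.
drbd_connection_state() {
    sed -n 's/.* cs:\([A-Za-z]*\) .*/\1/p'
}

# Print the synchronization progress in percent; prints nothing
# if no synchronization is running.
drbd_sync_percent() {
    sed -n "s/.*sync'ed: *\([0-9.]*\)%.*/\1/p"
}

# Possible use on a cluster node:
# drbd_connection_state < /proc/drbd
# drbd_sync_percent < /proc/drbd
```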

Once the synchronization is finished, it is necessary to check whether changes are replicated and roles can be switched. To check this, the following tasks are executed:

  • creating a file on the primary DRBD node, creating MD5 sum
  • unmounting the file system, downgrading to secondary role
  • upgrading the second DRBD node, mounting file system
  • find file, check MD5 sum
  • delete file and create another file
  • resetting the roles

In this example only the primary node can access the volume; the secondary node has no access to it. If you want both nodes to be able to access the volume at the same time (e.g. if you’re implementing an active/active cluster), ext4 is not the file system to choose. In such a scenario you will have to choose a cluster file system like GFS or OCFS2. These file systems offer special locking mechanisms to coordinate the access of the individual nodes to the volume.

node-a # dd if=/dev/zero of=/finch/bla.bin bs=1024k count=1
node-a # md5sum /finch/bla.bin > /finch/bla.bin.md5sum
node-a # umount /finch
node-a # drbdadm secondary drbd1

node-b # drbdadm primary drbd1
node-b # mount /dev/drbd1 /finch
node-b # ls /finch
lost+found    bla.bin    bla.bin.md5sum
node-b # md5sum -c /finch/bla.bin.md5sum
/finch/bla.bin: OK
node-b # rm /finch/bla.bin*
node-b # dd if=/dev/zero of=/finch/foo.bin bs=1024k count=1
node-b # md5sum /finch/foo.bin > /finch/foo.bin.md5sum
node-b # umount /finch
node-b # drbdadm secondary drbd1

node-a # drbdadm primary drbd1
node-a # mount /dev/drbd1 /finch
node-a # ls /finch
lost+found    foo.bin    foo.bin.md5sum
node-a # md5sum -c /finch/foo.bin.md5sum
/finch/foo.bin: OK

Seems to work like a charm! :)
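The file and checksum steps of this test can be wrapped into two small functions so that they are easy to repeat after every role switch. This is a sketch only; the function names are made up, and switching the DRBD roles and mounting the volume still has to be done as shown above.

```shell
# make_testfile DIR NAME: create a 1 MB file of zeros in DIR and store its
# MD5 sum next to it (hypothetical helper).
make_testfile() {
    dd if=/dev/zero of="$1/$2" bs=1024k count=1 2>/dev/null &&
    ( cd "$1" && md5sum "$2" > "$2.md5sum" )
}

# verify_testfile DIR NAME: succeed only if NAME in DIR still matches the
# stored MD5 sum (hypothetical helper).
verify_testfile() {
    ( cd "$1" && md5sum -c "$2.md5sum" >/dev/null 2>&1 )
}

# Possible use:
# node-a # make_testfile /finch bla.bin
# node-b # verify_testfile /finch bla.bin && echo "replication OK"
```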

Finch

As mentioned before, the multi-protocol messenger Finch (Pidgin) is used as the mission-critical application in this scenario. It is good practice to create a dedicated service user with the same UID on both nodes, so that the application can later be started by a script that heartbeat executes on either node. To start the tool in the background and make it accessible remotely, GNU Screen is used as a terminal multiplexer.

both-nodes # apt-get install screen finch
both-nodes # useradd -u 1337 -m -d /finch/home su-finch
both-nodes # gpasswd -a su-finch tty
both-nodes # passwd su-finch

Adding the user su-finch to the group tty is necessary to ensure that GNU screen is able to access the terminal if it is started using the su mechanism.

First of all finch can be configured to use an instant messenger account:

node-a # su - su-finch
node-a # screen
node-a # finch

If you have used Pidgin before, you might recognize the buddy list and chat windows.

Windows are switched using key combinations – in combination with GNU screen it is also possible to use mouse navigation. Some important pre-defined key combinations:

  • next window: ALT + N
  • previous window: ALT + P
  • close selected window: ALT + C
  • open context menu: F11
  • open action menu: ALT + A
  • open menu of current window: CTRL + P

heartbeat – computer, are you still alive?

heartbeat is, as its name implies, primarily used to ensure the availability of the individual cluster nodes. The tool communicates constantly with the neighboring cluster nodes through an encrypted tunnel and is able to react quickly to failures. If such a failure occurs, pre-defined scripts are executed to compensate for it. In this example two cluster resources are restarted on the next available node: the shared DRBD storage and the service finch.

Depending on its size and application, a cluster may have to ensure that faulty nodes won’t continue to serve their services, in order to avoid data corruption and service malfunction. This mechanism is called STONITH (Shoot the other node in the head). There are plenty of interfaces which can be used to make sure that the faulty cluster node stays down, for example:

  • the server’s remote management interface (iLO, DRAC, LOM,…)
  • UPS the server is connected to
  • PDU (Power Distribution Unit) the cluster node is consuming power from
  • blade enclosure management interface

If STONITH is not used, data corruption and application failure might occur in extreme cases when both cluster nodes believe that they are the only active node and run the application (e.g. because of a network failure). Of course STONITH can also be implemented under Linux, amongst others using heartbeat or Pacemaker. But in this example, this would go beyond the scope of this well-arranged scenario. ;-)

Applications can be served in a heartbeat cluster very easily because conventional init scripts are used for service management. In the ideal case, symbolic links are enough to integrate a service into a cluster.

In this example a user-defined init script which starts Finch and updates the NoIP hostname is created.

First of all, a cluster-wide configuration file (/etc/ha.d/ha.cf) is created. This configuration file includes essential parameters like log files and thresholds:

node-a # vi /etc/ha.d/ha.cf
debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 10
deadtime 30
warntime 20
initdead 60
ucast eth0 192.168.2.2
udpport 694
auto_failback off
node hostA.fqdn.dom
node hostB.fqdn.dom

ESC ZZ

node-b # vi /etc/ha.d/ha.cf
...
ucast eth0 192.168.1.2
...

ESC ZZ

The file differs between the nodes in one line (ucast) – the appropriate IP address of the other node is used.

Some explanations of the individual parameters:

  • debugfile / logfile / logfacility – debug log, general log and the syslog facility to use
  • keepalive – interval (in seconds) between heartbeat packets
  • warntime – time without a heartbeat after which a warning about a late node is logged
  • deadtime – time without a heartbeat after which a node is considered dead
  • initdead – dead time that applies right after heartbeat has been started, to allow for a slow network initialization during boot
  • ucast – IP address that unicast heartbeat packets are sent to
  • udpport – UDP port used for the heartbeat packets
  • auto_failback – defines whether failed cluster nodes get back their former resources once they return (if they were the preferred nodes for particular resources)
  • node – defines the cluster nodes
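The timing values only make sense in a certain order. The following sketch checks an ha.cf file against two assumptions: keepalive < warntime < deadtime, and, as a common rule of thumb, initdead being at least twice deadtime. The function names are hypothetical and the check is not part of heartbeat.

```shell
# ha_value FILE KEY: print the first value configured for KEY in an ha.cf file.
ha_value() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# check_ha_timing FILE: succeed if the timing values in FILE are plausible.
# Assumed rules (not enforced by heartbeat itself):
#   keepalive < warntime < deadtime   and   initdead >= 2 * deadtime
check_ha_timing() {
    keepalive=$(ha_value "$1" keepalive)
    warntime=$(ha_value "$1" warntime)
    deadtime=$(ha_value "$1" deadtime)
    initdead=$(ha_value "$1" initdead)
    [ "$keepalive" -lt "$warntime" ] 2>/dev/null &&
    [ "$warntime" -lt "$deadtime" ] 2>/dev/null &&
    [ "$initdead" -ge $(( deadtime * 2 )) ] 2>/dev/null
}

# Possible use:
# check_ha_timing /etc/ha.d/ha.cf || echo "please review the timing values"
```

The values used in this example (10, 20, 30 and 60 seconds) pass this check.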

Because the cluster communication is authenticated, a file /etc/ha.d/authkeys has to be created on both nodes. heartbeat expects this file to be readable by root only (mode 600):

both-nodes # vi /etc/ha.d/authkeys
auth 1
1 sha1 verylongandultrasavepasswordphrase08151337666

Afterwards the cluster resources are defined in the file /etc/ha.d/haresources:

both-nodes # vi /etc/ha.d/haresources
hostA.fqdn.dom
hostB.fqdn.dom  drbddisk::drbd1 Filesystem::/dev/drbd1::/finch::ext4    finch

This file lists all available cluster nodes and – separated using a tab – preferred resources. In this example there are two hosts, the second one is the primary cluster node for the following resources:

  • exclusively used DRBD volume drbd1
  • ext4 file system on /dev/drbd1, mounted as /finch
  • the service finch (/etc/init.d/finch)

This means: if both cluster nodes are available, the above mentioned resources always run on node 2 (hostB.fqdn.dom). If this node fails, the resources are restarted on node 1 (hostA.fqdn.dom). Because of the setting “auto_failback off” (in the file /etc/ha.d/ha.cf), the resources are not moved back automatically once the failed cluster node becomes available again.
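To double-check which node a resource is assigned to, the file can be queried with a short awk call. This is a sketch that assumes the whitespace-separated layout shown above; the function name preferred_node is made up.

```shell
# preferred_node RESOURCE FILE: print the node that lists RESOURCE in a
# haresources-style file (first match wins). Hypothetical helper.
preferred_node() {
    awk -v res="$1" '{
        for (i = 2; i <= NF; i++)
            if ($i == res) { print $1; exit }
    }' "$2"
}

# Possible use:
# preferred_node finch /etc/ha.d/haresources
```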

Finally the init script for starting and stopping the finch service needs to be created. Like other init scripts, it is placed under /etc/init.d/finch:

# vi /etc/init.d/finch
#!/bin/bash
#
# finch        Startup script for finch including noip update
#

start() {
        /usr/local/bin/noip2 -i $(curl --silent http://icanhazip.com) >/dev/null 2>&1
        chmod g+rw $(tty)
        su -c "screen -d -m" su-finch
        RESULT=$?
        return $RESULT
}
stop() {
        /usr/bin/killall -u su-finch
        RESULT=$?
        return $RESULT
}
status() {
        su -c "screen -ls" su-finch
        cat /proc/drbd
        dig +short foo.noip.com
        RESULT=$?
        return $RESULT
}

case "$1" in
  start)
        start
        ;;
  stop)
        stop
        ;;
  status)
        status
        ;;
  *)
        echo $"Usage: finch {start|stop|status}"
        exit 1
esac

exit $RESULT

This init script recognizes the parameters start, stop and status – it might almost be LSB-compatible. :)

Depending on the parameter, a GNU screen session is started or stopped using the account of the service user su-finch. The status parameter lists the current screen sessions, the DRBD state and the IP address the hostname currently resolves to.

Whenever heartbeat starts or stops the finch resource, this script is used.

Function test

Ok, this sounds nice – but is it working at all?

Of course it is – the following video shows a demonstration of cluster failover:

:D

Monitoring

It is good practice to monitor a mission-critical application. Besides the availability of the cluster nodes themselves, the state of DRBD and heartbeat is of interest, too. While the availability of the heartbeat service can be checked easily using the Nagios/Icinga plugin check_procs, there is a dedicated shell script for DRBD (free to download) on the MonitoringExchange website: [click me!]

This script can be included into Nagios or Icinga and used easily, e.g. in this example on a passive Icinga instance:

# cat /etc/icinga/commands.cfg
...
# 'check_drbd' command definition
define command{
        command_name    check_drbd
        command_line    $USER2$/check_drbd -d $ARG1$ -e "Connected" -o "UpToDate" -w "SyncingAll" -c "Inconsistent"
}

# cat /etc/icinga/objects/hostA.cfg
...
define service{
        use                             generic-service
        host_name                       hostA
        service_description             HW: drbd1
        check_command                   check_drbd!1
        }

In this example the availability of the DRBD volume /dev/drbd1 is checked. If the result of the script (based on the content of the file /proc/drbd) is not “Connected/UpToDate”, a failure has occurred. If the volume is synchronizing (SyncingAll), Nagios/Icinga reports a warning; an inconsistent volume (Inconsistent) raises a critical event.
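The decision logic described here can be illustrated with a few lines of shell. This is not the code of the check_drbd plugin, only a simplified sketch of the mapping to the usual Nagios exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL); the function name and the treatment of all other states as critical are assumptions.

```shell
# drbd_status_code CONNECTION_STATE DISK_STATE
# Print a Nagios-style status code for the given DRBD states.
# Simplified illustration, not the logic of the real check_drbd plugin.
drbd_status_code() {
    case "$1/$2" in
        Connected/UpToDate)        echo 0 ;;  # everything fine
        SyncSource/*|SyncTarget/*) echo 1 ;;  # synchronization is running
        *)                         echo 2 ;;  # inconsistent or not connected
    esac
}

drbd_status_code Connected UpToDate       # prints 0
drbd_status_code SyncTarget Inconsistent  # prints 1
```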

Conclusion

There is no doubt that this scenario is escapist rather than realistic; it is just for fun. I just wanted to show how easy implementing high availability under Linux can be. There are many ways to get it working, and heartbeat with DRBD is only one of many HA combinations. The topic isn’t as complex as is often unjustly claimed. In the first step it is irrelevant whether a database or an instant messenger is “clustered” using heartbeat; the implementation effort is manageable.

heartbeat is said to be obsolete. Pacemaker and OpenAIS/Corosync are two more modern utilities that can be used in combination with DRBD to implement more complex and larger HA scenarios.

As a matter of principle hardware components should be designed redundant before implementing software HA solutions. In this example, there are some architecture mistakes that should be fixed in case of practical application:

  • no dedicated network for node communication (heartbeat network)
  • no redundant storage for cluster storage (RAID volume)
  • network adapters are not redundant (no double NICs/teaming or appropriate connected switches; LACP?)
  • no redundant power supply

At least separate fire areas were chosen for this scenario (the Raspberry Pis are located in two different flats)! :)

However – if you’re interested in keeping your instant messenger redundant, you know how to do this now. ;-)

iOS and IPCop/IPFire OpenVPN

OpenVPN-Profile


OpenVPN Connect is a good OpenVPN client for iOS devices with version 5.0 or higher.

Using this app VPN tunnels can be managed and used comfortably. Unfortunately the respective OpenVPN configuration files can’t be edited directly on the iPhone, iPod or iPad like in the Android application. The first setup might be more complex because you’ll have to modify the configuration files on a computer and copy them to the device using iTunes afterwards.

Beyond that there are some additional restrictions:

  • Certificates need to be integrated in the configuration file
  • TAP devices are currently not working
  • Error messages while managing certificates can’t be scrolled and won’t fit on the screen in vertical mode

The appropriate iOS OpenVPN configuration varies based on your server configuration. As mentioned above, TAP configurations aren’t working currently.

I’m using OpenVPN with an IPCop router. This router uses TUN and certificates for users and CA by default. In this setup it is necessary to extract the user and CA certificates (requires an installed OpenSSL distribution) to include the certificates into the OpenVPN configuration afterwards:

# openssl pkcs12 -in name.p12 -nocerts -nodes -out keys.pem
Enter Import Password:
MAC verified OK
# openssl pkcs12 -in name.p12 -cacerts -nodes -out ca.pem
Enter Import Password:
MAC verified OK
# openssl pkcs12 -in name.p12 -out name.pem
Enter Import Password:
MAC verified OK
Enter PEM pass phrase:

The client ZIP provided by IPCop / IPFire contains a configuration file that needs to be modified. The mentioned crypto archive (pkcs12) needs to be removed or commented out – the previously extracted certificates are added in XML syntax:

#OpenVPN Server conf
tls-client
client
dev tun
proto udp
tun-mtu 1400
remote HOSTNAME PORT
#pkcs12 name.p12
cipher BF-CBC
verb 3
ns-cert-type server

#ca.pem
<ca>
-----BEGIN CERTIFICATE-----
...
-----END CERTIFICATE-----
</ca>
#name.pem
<cert>
-----BEGIN CERTIFICATE-----
....
-----END CERTIFICATE-----
</cert>
#keys.pem
<key>
-----BEGIN PRIVATE KEY-----
...
-----END PRIVATE KEY-----
</key>
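Pasting the certificates by hand is error-prone, so the inline blocks can also be generated. The following is a sketch with a made-up function name; it keeps only the PEM blocks of a file and therefore drops extra header lines such as the “Bag Attributes” that openssl pkcs12 may write in front of them.

```shell
# inline_block TAG PEM_FILE: print the PEM block(s) of PEM_FILE wrapped in
# <TAG> ... </TAG>, as expected for inline files in an OpenVPN configuration.
# Hypothetical helper; lines outside BEGIN/END markers are dropped.
inline_block() {
    echo "<$1>"
    sed -n '/-----BEGIN/,/-----END/p' "$2"
    echo "</$1>"
}

# Possible use (client.ovpn is an assumed file name):
# inline_block ca   ca.pem   >> client.ovpn
# inline_block cert name.pem >> client.ovpn
# inline_block key  keys.pem >> client.ovpn
```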

Afterwards the OpenVPN configuration is saved and copied to the iOS device using iTunes. After installing OpenVPN Connect there should be an appropriate tab in iTunes (don’t forget to scroll down!).

OpenVPN-Log


Using drag & drop the file can be transferred easily. On your iOS device the new profile is recognized and imported after confirmation.

Connection attempts are logged automatically, so if there’s a problem with connecting you can find out the reason. Active VPN connections are indicated, like IPsec and PPTP tunnels, by the well-known VPN icon in the status bar. :)

Default gateway ignored under RHEL / CentOS 5.3

On RHEL or CentOS 5.3 it can happen that a configured default gateway is ignored. In this case the routing table doesn’t contain an appropriate entry…

# netstat -r|grep default

…even though the gateway was set both in the main network configuration…

# cat /etc/sysconfig/network
...
GATEWAY=10.24.36.1

…and interface configuration:

# cat /etc/sysconfig/network-scripts/ifcfg-eth0
...
GATEWAY=10.24.36.1

To fix this issue it is necessary to add the following entry to a routing file:

# vi /etc/sysconfig/network-scripts/route-eth0
default via 10.24.36.1 dev eth0 onlink

After restarting the network the default gateway is used:

# service network restart
# netstat -r|grep default
default       10.24.36.1     0.0.0.0     UG    0 0     0 eth0
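If you want to check for the missing route in a script, for instance after a reboot, the test can be reduced to a single grep. This is a sketch with a made-up function name; it reads routing table output from standard input and accepts both the “default” and the numeric “0.0.0.0” notation.

```shell
# has_default_route: read routing table output (e.g. from "netstat -r",
# "netstat -rn" or "ip route") on standard input and succeed if it
# contains a default route. Hypothetical helper.
has_default_route() {
    grep -Eq '^(default|0\.0\.0\.0)[[:space:]]'
}

# Possible use:
# netstat -rn | has_default_route || echo "default gateway is missing"
```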

:)

CDE under Debian Squeeze

Many administrators and IT enthusiasts might still know the Common Desktop Environment (CDE) from the old UNIX days. Introduced in 1993, it was the standard desktop of HP-UX, IBM AIX, Sun Solaris and Tru64 for more than 10 years. Although Solaris dropped its CDE support 3 years ago, the old desktop is still used in HP-UX, AIX and OpenVMS.

A petition for publishing CDE’s source code was started in 2006. After 6 years, the desired source code was published in September 2012. Currently, there’s an alpha version which can be compiled under Linux.

Debian Squeeze is mentioned in the list of currently supported Linux distros – it’s time to have a look at it, isn’t it? ;-)

Preparation

If not installed yet, you’ll have to install HAL, DBUS and the X-server core components:

# apt-get install xserver-xorg-core xinit dbus hal xserver-xorg-input-kbd xserver-xorg-input-mouse xserver-xorg-input-vmmouse

CDE requires additional development tools and libraries – they have to be installed, too:

# apt-get install git libxp-dev libxt-dev libxmu-dev libxft-dev libxaw7-dev libx11-dev libjpeg-dev libjpeg62-dev libfreetype6-dev lesstif2 x11-xserver-utils ksh m4 ncompress xfonts-100dpi* rpcbind bison xbitmaps libmotif* x11-xserver-utils tcl-dev lprng

Compiling

You can retrieve the source code from a Git mirror or as a tar snapshot. Afterwards you’ll have to link the X-server header files before compiling CDE. After compiling, some installation and configuration scripts have to be executed, and finally a spool directory for CDE’s calendar application is created:

You can find additional information in the official build instructions: [click me!]

# w3m http://downloads.sourceforge.net/project/cdesktopenv/src/cde-src-2.2.0c-alpha.tar.gz
# tar xfz cde-src-2.2.0c-alpha.tar.gz
# cd cdesktopenv-code/cde
# mkdir -p imports/x11/include
# cd imports/x11/include
# ln -s /usr/include/X11 .
# cd ../../..
# make World
# admin/IntegTools/dbTools/installCDE -s /path-to-cdesktopenv-code/cde/
# admin/IntegTools/post_install/linux/configRun -e
# chmod -R a+rwx /var/dt
# mkdir -p /usr/spool/calendar

Display manager

You can either use the Motif-based dtlogin display manager…

# export PATH=$PATH:/usr/dt/bin
# LANG=C /usr/dt/bin/dtlogin

…or configure another display manager like SLiM to start CDE:

# apt-get install slim
# vi /etc/slim.conf
...
#login_cmd           exec /bin/bash -login /etc/X11/Xsession %session
login_cmd        exec /bin/bash -login /usr/dt/bin/startxsession.sh

ESC ZZ

# vi /usr/dt/bin/startxsession.sh
#!/bin/sh
export PATH=$PATH:/usr/dt/bin
export LANG=C
/usr/dt/bin/Xsession

ESC ZZ

# service slim start

Gallery

Attached some screenshots of the retro desktop:

Differences between Spacewalk, Red Hat Network Satellite and SUSE Manager

Red Hat Network Satellite and SUSE Manager are two management suites for the Linux enterprise distros Red Hat Enterprise Linux and SUSE Linux Enterprise Server.

At first sight these products look very similar to each other and there are also technical analogies because the products are based on the same core code: Red Hat Spacewalk. Red Hat Spacewalk was released as open source by Red Hat in 2008 – it is also the base platform for the commercial Satellite server.

So, what’s the difference between these products? The following table shows the commonalities and differences:

              | Spacewalk                    | RHN Satellite       | SUSE Manager
Version       | 1.9                          | 5.5                 | 1.7
Link          | [click me!]                  | [click me!]         | [click me!]
Pricing       | free                         | Subscription/module | Subscription/module
Management of | Fedora, CentOS, SUSE, Debian | RHEL, Solaris       | SUSE, RHEL* (see below)
Architectures | i386, x86_64                 | i386, x86_64, s390x | i386, x86_64, ia64, s390x, ppc, ppc64
Database      | Postgres, Oracle             | Oracle 10gR2/11g    | Postgres, Oracle 10gR2/11g

Functions (shortened):
  • Logical grouping of hosts and software channels
  • Software and patch management, serving distributor and custom software
  • Provisioning of physical and virtual hosts
  • Basic host monitoring
  • Compliance reporting and alerting
  • Proxy server, managed hosts don’t need a direct internet connection anymore

Central management of RHEL and SLES systems with one suite?

What are the possibilities if RHEL and SLES systems are to be operated in parallel and managed centrally?

That is something I was wondering about, too, so I did some research on the internet and got in touch with SUSE on Twitter.

Basically, there are two possibilities – find out for yourself if those are good solutions for you.

Possibility 1 – SUSE Expanded Support

Besides SLES patches, SUSE also provides fixes for RHEL within “SUSE Expanded Support”. This offering was designed for long migrations and is meant to make Red Hat support obsolete, because patches are no longer received directly through Red Hat Network. In the end the quality of the software packages should be the same because they are based on the same source code, but I’d prefer to receive distro patches from the original distributor.

If you’re not planning a migration and just want to combine the best from both “worlds” this might not be the best solution for you.

Possibility 2 – mrepo

Another possibility is a tool called mrepo. It creates mirrors of YUM repositories and serves the RPM packages locally. According to its website this also works with RHN channels. It seems hard to believe that this is possible without breaking any license terms, because downloading RPMs from RHN is deliberately made complicated. This feature should only be possible using the RHN Satellite server.

This might be a possible solution for testing purposes, but for production environments where flawless support is indispensable I’d advise against it.

Conclusion

In the end, the products RHN Satellite and SUSE Manager are quite similar (quasi “the same in green” :-P ). Unfortunately, combining both “worlds” isn’t that easy.

This is not a technical problem but rather a legal licensing issue, as a SUSE employee told me recently. The reasons are understandable: the manufacturers want to sell their own product to stay competitive.

In my opinion, if you use both products and want a central management suite, you’ll have to migrate one of the system landscapes or spend more money to run both management tools in parallel. An alternative would be using a trick to cache the RPM packages from Red Hat Network and share them. If it is not important to you where your systems’ patches come from, you could also have a look at “SUSE Expanded Support”.

For me it is absolutely important to have flawless support from the original distributor. In any case I’d prefer to implement both management products.

Gallery

Attached some screenshots of the management suites.

Obsolete tools: nslookup & ifconfig

nslookup and ifconfig are two well-known tools for configuring the network of Unix/Linux hosts and checking whether DNS is working properly.

ifconfig first appeared as part of the 4.2BSD distribution in 1983 and quickly became the standard tool for network configuration. Even commercial Unices like Solaris or HP-UX integrated the utility.

Some Linux distributions don’t ship ifconfig anymore (e.g. ArchLinux). Other distributions (e.g. SuSE/SLES and Fedora) warn that the tool will be removed someday:

# man ifconfig
...
       WARNING: Ifconfig is obsolete on system with Linux  kernel  newer  than
       2.0.  On  this  system  you  should  use ip. See the ip manual page for
       details
...
NOTE
       This program is obsolete! For replacement check ip addr and ip link.
       For statistics use ip -s link.

Solaris 11 introduced a new command for network configuration: ipadm (thanks for the hint, Prometheus!).

ip works beginning with Linux 2.2. It is more modern and also includes amongst others the functionality of the route and arp utilities.

Here are some of the most used ifconfig/route commands and their ip counterparts:

Task               | ifconfig/route                  | ip
Show all NICs      | ifconfig                        | ip addr show
Show specific NIC  | ifconfig eth0                   | ip addr show eth0
Disable NIC        | ifconfig eth0 down              | ip link set eth0 down
Enable NIC         | ifconfig eth0 up                | ip link set eth0 up
Assign IP          | ifconfig eth0 [IP] netmask [NM] | ip addr add [IP]/[CIDR] dev eth0
Show routing table | route, netstat -r               | ip route
Set standard route | route add default gw [IP] eth0  | ip route add default via [IP]
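One practical difference when assigning an address: ifconfig takes a dotted netmask, while ip expects a CIDR prefix length. The conversion can be sketched in plain shell; the function name netmask_to_cidr is made up, and the sketch does not check that the set bits of the mask are contiguous.

```shell
# netmask_to_cidr NETMASK: print the prefix length for a dotted netmask,
# e.g. 255.255.255.0 -> 24. Hypothetical helper; it only accepts valid
# octet values and does not verify that the mask is contiguous.
netmask_to_cidr() {
    bits=0
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    for octet in "$@"; do
        case $octet in
            255) bits=$((bits + 8)) ;;
            254) bits=$((bits + 7)) ;;
            252) bits=$((bits + 6)) ;;
            248) bits=$((bits + 5)) ;;
            240) bits=$((bits + 4)) ;;
            224) bits=$((bits + 3)) ;;
            192) bits=$((bits + 2)) ;;
            128) bits=$((bits + 1)) ;;
            0)   ;;
            *)   echo "invalid netmask" >&2; return 1 ;;
        esac
    done
    echo "$bits"
}

netmask_to_cidr 255.255.255.0    # prints 24
```

The result can then be used as the [CIDR] part of the ip call.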

Of course ip has a lot of more features – just a few of them:

  • Set the MTU and enable/disable promiscuous mode
  • Configure Multicasting and VLANs
  • ARP table maintenance

Like ifconfig, nslookup has been obsolete for a long time. There are two replacement tools that are able to do more: host and dig. The developers of the open-source DNS server BIND advise against using nslookup (source):

Due to its arcane user interface and frequently inconsistent
behavior, we do not recommend the use of nslookup.
Use dig instead.

Here are some commonly used nslookup commands and their dig counterparts.

Task                      nslookup                         dig
Forward lookup            nslookup google.de               dig google.de
                                                           dig +short google.de
Reverse lookup            nslookup [IP]                    dig -x [IP]
                                                           dig +short -x [IP]
Use specific DNS server   nslookup google.de [DNS]         dig @[DNS] google.de
                                                           dig @[DNS] +short google.de
Ask for MX records        nslookup -query=mx google.de     dig google.de MX
                                                           dig +short google.de MX
Specific timeout          nslookup -timeout=42 google.de   dig google.de +time=42
                                                           dig +short google.de +time=42
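
For a reverse lookup, dig -x saves you from building the query name by hand: the octets of the IPv4 address are reversed and in-addr.arpa is appended. A small sketch of that rewriting (reverse_name is a hypothetical helper and handles IPv4 only):

```shell
#!/bin/sh
# reverse_name: hypothetical helper that prints the PTR query name for an
# IPv4 address, i.e. the reversed octets followed by in-addr.arpa.
reverse_name() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into its octets
    IFS=$old_ifs
    [ $# -eq 4 ] || { echo "expected an IPv4 address" >&2; return 1; }
    echo "$4.$3.$2.$1.in-addr.arpa."
}

reverse_name 173.194.44.56    # prints 56.44.194.173.in-addr.arpa.
```

Querying that name for a PTR record ("dig PTR 56.44.194.173.in-addr.arpa.") asks the same question as "dig -x 173.194.44.56".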

The option +short is really useful if you just want to get an IP or a hostname. By default, dig also prints some additional information like the IP and response time of the DNS server that was used:

; <<>> DiG 9.9.2-P2 <<>> google.de
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33883
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 16384
;; QUESTION SECTION:
;google.de.                     IN      A

;; ANSWER SECTION:
google.de.              300     IN      A       173.194.44.56
google.de.              300     IN      A       173.194.44.55
google.de.              300     IN      A       173.194.44.63

;; Query time: 66 msec
;; SERVER: 208.67.222.222#53(208.67.222.222)
;; WHEN: Sun Apr  7 17:18:22 2013
;; MSG SIZE  rcvd: 86

# dig +short google.de
173.194.44.55
173.194.44.56
173.194.44.63
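
If you need the full output for debugging but still want to reuse the addresses in a script, the answer section is easy to filter. A minimal sketch, assuming the output layout shown above (answer_ips is a hypothetical helper and only looks at A records):

```shell
#!/bin/sh
# answer_ips: hypothetical helper that reads full dig output on stdin and
# prints the address of every A record, similar to what +short shows.
# Lines starting with ";" are comments and are skipped.
answer_ips() {
    awk '$1 !~ /^;/ && $3 == "IN" && $4 == "A" { print $5 }'
}

# Fed with an excerpt of the output shown above:
answer_ips <<'EOF'
;; QUESTION SECTION:
;google.de.                     IN      A

;; ANSWER SECTION:
google.de.              300     IN      A       173.194.44.56
google.de.              300     IN      A       173.194.44.55
EOF
```

On a live system the same filter can be used as "dig google.de | answer_ips".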

It’s also a good idea to have a look at the manpage of dig because it offers a lot of features beyond the ones mentioned here. :)

CRUX-ARM 2.8 on Raspberry Pi

By now there are plenty of operating systems available for the Raspberry Pi – including an ARM version of the source-based Linux distro CRUX.

So if you like handcrafting or think Raspbian is “too mainstream”, you can have a lot of fun with an SD card with at least 1 GB of capacity and a pot of coffee. ;-)

Partitions and mounting

The following partitions have to be created on the SD card:

  • Partition 1, later /dev/mmcblk0p1 – /boot, VFAT, at least 100 MB
  • Partition 2, later /dev/mmcblk0p2 – /, ext3, at least 512 MB
  • Partition 3, later /dev/mmcblk0p3 – swap, ideally 100-512 MB

It makes sense to do this under Linux because archives have to be extracted onto the SD card afterwards:

# fdisk /dev/sdX
...
# mkfs.vfat /dev/sdX1
# mkfs.ext3 /dev/sdX2
# mkswap /dev/sdX3
# mkdir -p /mnt/{b,r}oot
# mount /dev/sdX2 /mnt/root
# mount /dev/sdX1 /mnt/boot
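
Instead of answering the fdisk prompts by hand, the partition layout listed above could also be described as an sfdisk script. This is only a sketch: crux_pi_layout is a hypothetical helper, the sizes are parameters (the defaults are simply the minimum values from the list), and it assumes a reasonably recent sfdisk from util-linux that understands this script format:

```shell
#!/bin/sh
# crux_pi_layout: hypothetical helper that prints an sfdisk script for the
# three partitions listed above. Arguments are sizes in MiB; the defaults
# are just the minimum values from the list, not a recommendation.
# MBR type codes: c = W95 FAT32 (LBA), 83 = Linux, 82 = Linux swap.
crux_pi_layout() {
    boot_mib=${1:-100}
    root_mib=${2:-512}
    swap_mib=${3:-128}
    cat <<EOF
label: dos
size=${boot_mib}MiB, type=c
size=${root_mib}MiB, type=83
size=${swap_mib}MiB, type=82
EOF
}

# Only print the layout here; nothing is written to any disk.
crux_pi_layout 100 768 128
```

To apply it you would pipe the output into sfdisk, e.g. "crux_pi_layout 100 768 128 | sfdisk /dev/sdX". This overwrites the partition table, so double-check the device name first.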

Copying and configuring

The following files are copied to the FAT partition:

After all files are copied, the root file system and kernel modules are extracted:

# tar -pxf /mnt/boot/crux-arm-rootfs-2.8-hardfp-raspberrypi.tar.xz -C /mnt/root
# tar -pxf /mnt/boot/modules-3.6.1-raspberrypi_20130305.tar.xz -C /mnt/root

Some lines of the file /etc/fstab are modified afterwards:

# vi /mnt/root/etc/fstab
...
#/dev/#REISERFS_ROOT#  /         reiserfs  defaults               0      0
/dev/mmcblk0p2    /         ext3      defaults               0      1
/dev/mmcblk0p1          /boot   vfat    defaults        0       2
...
#/dev/#XFS_ROOT#       /         xfs       defaults               0      0
/dev/mmcblk0p3           swap      swap      defaults               0      0
...

ESC ZZ

The file cmdline.txt has to be modified like this:

# vi /mnt/boot/cmdline.txt
smsc95xx.turbo_mode=N dwc_otg.lpm_enable=0 console=tty0 root=/dev/mmcblk0p2 rootfstype=ext3 rootwait

ESC ZZ
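
A typical reason for a Pi that refuses to boot is a root device in cmdline.txt that doesn't match the one in /etc/fstab. The following sketch compares both files before the card is unmounted. The helper names are hypothetical and the paths are the mount points used above:

```shell
#!/bin/sh
# Hypothetical helpers to compare the root device in cmdline.txt and fstab.

# Print the value of the root= kernel parameter found in the given file.
cmdline_root() {
    tr ' ' '\n' < "$1" | sed -n 's/^root=//p'
}

# Print the device that an fstab file mounts at /, skipping comment lines.
fstab_root() {
    awk '$1 !~ /^#/ && $2 == "/" { print $1 }' "$1"
}

# Succeed only if both files name the same, non-empty root device.
same_root_device() {
    dev=$(cmdline_root "$1")
    [ -n "$dev" ] && [ "$dev" = "$(fstab_root "$2")" ]
}

# Paths as used above; adjust them if the card is mounted elsewhere.
if [ -r /mnt/boot/cmdline.txt ] && [ -r /mnt/root/etc/fstab ]; then
    same_root_device /mnt/boot/cmdline.txt /mnt/root/etc/fstab \
        && echo "root device is consistent" \
        || echo "root device mismatch" >&2
fi
```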

After carefully unmounting the SD card, CRUX-ARM 2.8 should be able to boot:

# cd
# umount /mnt/{b,r}oot
# eject /dev/sdX

;-)

Network configuration

The network configuration needs to be modified. By default, a static IP (192.168.1.100/24) is used which might not always be the best setting – e.g. if you’re using DHCP:

# vi /etc/rc.d/net
...
start)
        # loopback
        /sbin/ip addr add 127.0.0.1/8 dev lo broadcast + scope host
        /sbin/ip link set lo up
        # ethernet
        /sbin/ip link set eth0 up
        /sbin/dhcpcd eth0
        ;;
stop)
        /usr/bin/killall dhcpcd
        /sbin/ip route del default
        /sbin/ip link set eth0 down
        /sbin/ip link set lo down
        /sbin/ip addr del 127.0.0.1/8 dev lo
        ;;

ESC ZZ

After a restart of the network service, a connection should have been established:

# /etc/rc.d/net restart
# ping google.de

If not, have a closer look at the error messages.
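
A good first thing to check is whether dhcpcd actually obtained an address, because ping tests DNS, routing and addressing all at once. A small sketch that filters the IPv4 addresses out of the ip output (inet_addrs is a hypothetical helper and the sample address is made up):

```shell
#!/bin/sh
# inet_addrs: hypothetical helper that reads "ip addr show" output on
# stdin and prints every IPv4 address (with prefix length) it contains.
inet_addrs() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "inet") print $(i + 1) }'
}

# Sample input in the usual "ip addr show" layout; the address is made up.
inet_addrs <<'EOF'
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP
    link/ether b8:27:eb:00:00:01 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.23/24 brd 192.168.1.255 scope global eth0
EOF
```

On the Pi itself you would run "ip addr show eth0 | inet_addrs". If nothing is printed, the interface has no IPv4 address and the problem is most likely the DHCP lease rather than DNS.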

Updates

A couple of updates have been published since the CRUX-ARM 2.8 Raspberry Pi image was created – you should install them. It’s important that your date/time is set correctly, otherwise builds may be aborted with an error message:

# date --set="3 Apr 2013 19:50:00"     # please replace!
# ports -u
# echo "Updates: $(ports -d|tail -n+2|wc -l)"
# prt-get sysup
...

Please be patient. The Raspberry Pi has to compile all updates – and this takes some time on its modest CPU. ;-)
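
Regarding the date requirement: the Raspberry Pi models available at the time have no battery-backed clock, so the date is wrong after a boot unless something sets it. A tiny guard like the following could be run before starting a long update. It is a sketch only. clock_plausible is a hypothetical helper, and the year 2013 is chosen because that is when the image and its updates were published:

```shell
#!/bin/sh
# clock_plausible: hypothetical check that succeeds if the current year is
# at least the given one. 2013 is used because that is when the image and
# its updates were published.
clock_plausible() {
    [ "$(date +%Y)" -ge "$1" ]
}

if clock_plausible 2013; then
    echo "clock looks plausible"
else
    echo "clock seems to be off, set the date before running prt-get" >&2
fi
```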

Tools

Some essential tools like GNU screen, elinks and ntp are missing in the minimal image of CRUX-ARM 2.8:

# useradd ntp; groupadd ntp
# prt-get depinst screen ntp elinks python

Please be patient – the tiny machine isn’t very fast and takes some time to compile the sources.

Tweaks

If you want to, you can “tune” your Raspberry Pi by overclocking the CPU, disabling overscan, etc.:

# vi /boot/config.txt
disable_overscan=1
arm_freq=950
gpu_mem=16
core_freq=250
sdram_freq=450

ESC ZZ

These settings largely match the Raspbian “High” preset. The CPU is overclocked to 950 MHz, the core runs at 250 MHz and the memory at 450 MHz. Overscan is disabled and 16 MB are reserved for the GPU – ideal if you want to use the Raspberry Pi as a server.

New strategy objectives for Ubuntu: custom kernel, exclusive hardware and the cloud?

Ubuntu is doubtless one of the most innovative Linux distributions – it made Linux desktops more user-friendly and thereby raised the acceptance among end users in recent years.

Currently the rumor mill is working overtime again – some reliable sources announced that some big strategy changes are pending for the distribution. These changes are the focus of this article.

New engine: ARM-focused – no GNU/Linux for the first time?

According to insider information there is a medium-term objective to move to a new kernel platform. In the past the maintenance of the Linux kernel turned out to be a very time-consuming and complex task. Special Ubuntu modifications have to be applied afterwards and new device drivers are often buggy, which decreases customer satisfaction.

According to internal analysis these problems are caused by the obsolete monolithic design of the Linux kernel. To solve this issue, first tests of alternative kernel architectures are currently taking place. It seems that a previously unknown unixoid hybrid kernel, which is based on the Mach principle and includes some monolithic elements, is quite convincing.

Another target is to limit the hardware support to some exclusive manufacturers – the products of these manufacturers are going to be supported 100%. Three manufacturers from Round Rock, Raleigh and Cupertino are rumored to be potential contractual partners. The end user won’t have to worry about driver support anymore and could buy any product from the broad product portfolio of these manufacturers.

It seems that the classical 32-bit architecture i686 isn’t interesting anymore – according to corresponding hints it might be discontinued with the upcoming release “13.10 Sloppy Seagull”. In the medium term this shall happen to the 64-bit architecture x86_64, too. It seems that this is a preparation for concentrating on the ARM architecture, which is more interesting for the consumer market. According to representative studies and market analysis the sales figures of tablets and smartphones will be four times higher than those of conventional personal computers. It seems that it’s a good idea to get prepared for this trend by carrying out all required restructuring measures.

Focusing on the consumer market also affects the maintenance of Ubuntu Server – this product is going to be provided for ARM-based devices only by 2014. The i686 and x86_64 support for desktop and server releases will probably be discontinued at the same time.

New release cycles and update mechanisms

The typical 6-month release cycle with an additional 1.5 to 2 years of update maintenance is to be replaced by a new system called “Short Term Support (STS)”. Security updates for future releases will be provided for up to 6 months.

The primary target of this restructuring is to provide the latest software. Thanks to the reduced effort for hardware support (see above) there is more manpower to test and (if required) patch the latest software from different angles.

Updates and additional applications won’t be downloadable using apt or aptitude anymore – this mechanism is going to be replaced with a subscription principle, which has also been established for some enterprise Linux distributions. These subscriptions can be bought online in a multimedia store which also offers movies and books. Additional applications are provided as purchasable “apps”.

Software harmonization

Next to the kernel maintenance there is another major target: the software harmonization of the broad software portfolio of Ubuntu.

Currently there are plenty of desktop environments for Ubuntu including GNOME, KDE and LXDE. This variety of programs no longer matches the original rule of thumb to provide one application for every task.

The above-named desktop environments all have different agendas, advantages and disadvantages, which are going to be combined.

After the work on Mir, a custom display server, was announced in March, it seems that the major objective also includes the development of a custom desktop environment. This environment is going to be named Ubuntu Desktop Environment (UDE) and impresses with a clean appearance. The environment isn’t designed to be controlled with a mouse and an external keyboard – instead it uses a very mature voice recognition which was developed together with Hessian and Swabian universities. Plenty of internal tests went satisfactorily and promise global end-user enthusiasm.

Applications are going to be executed in fullscreen mode only – this is a principle which is already known from smartphones and tablets. A feature that will be missing is multitasking because users aren’t using more than one application at the same time (according to internal analysis).

Off to the cloud: complete Ubuntu One integration

After cloud storage was first offered with Ubuntu One in 2009, it has become another objective to stop providing local user accounts with the open operating system.

User accounts shall be implemented using the popular social network Facebook in the future. Because personal data is going to be saved exclusively on Ubuntu One instead of dedicated /home partitions, users won’t have to worry about complex data synchronization anymore. Users can log in on any Ubuntu device using face recognition or a three-digit security code and access their sensitive data.

Connected USB storage devices are automatically uploaded to the cloud. Existing music and video data will be synchronized with appropriate online portals automatically.

Perspective: first screenshots and new release names

Until now, release names were selected and announced at short notice – this time, several upcoming code names have been chosen well before their release. As before, a release name consists of an adjective and an animal name:

Release Code name
13.10 Sloppy Seagull
14.04 Trendy Turkey
14.10 Ubiquitous Unicorn
15.04 Violet Vulture
15.10 Wretched Worm

I found some first secret screenshots of the upcoming user interface “Ubuntu Desktop Environment (UDE)” on the internet – you can find them here: [click me!]

There’s also a first video of UDE: [click me!]

Short tip: postfix – SASL authentication failure: No worthy mechs found

If you install and configure a Postfix mail server for relaying mails using an external smarthost you might be puzzled by the following error message:

Mar 19 17:00:22 hostname01 postfix/smtp[2003]: warning: SASL authentication failure: No worthy mechs found

The reason for this issue can be really trivial – in my case I did a minimal installation of RHEL which came without SASL and the appropriate plain module.

Installing the needed libraries fixed the problem for me:

# yum install cyrus-sasl{,-plain}

If appropriate SASL modules are present you might have a closer look at the Postfix main configuration file /etc/postfix/main.cf:

  • typing errors?
  • SASL SMTP password map missing or erroneous? (postmap?)
  • SASL SMTP  security options missing or erroneous?
  • sender_canonical missing?
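
Part of this checklist can be automated. The following sketch lists relay-related parameters that are not set in main.cf. missing_relay_params is a hypothetical helper, and the parameter list is my assumption about a typical smarthost setup with SASL authentication, not an official checklist:

```shell
#!/bin/sh
# missing_relay_params: hypothetical helper that lists relay-related
# parameters which are not set in the given main.cf. The parameter list is
# an assumption about a typical smarthost setup, not an official checklist.
missing_relay_params() {
    for param in relayhost smtp_sasl_auth_enable \
                 smtp_sasl_password_maps smtp_sasl_security_options; do
        # Parameter names start in the first column of main.cf.
        grep -Eq "^${param}[[:space:]]*=" "$1" || echo "$param"
    done
}

main_cf=/etc/postfix/main.cf
if [ -r "$main_cf" ]; then
    missing_relay_params "$main_cf"
fi
```

No output means all four parameters are at least present; whether their values are correct still has to be checked by hand. After editing the password map, remember to rebuild it with postmap.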