...

The steps below result in two of the database server's CPUs handling ethernet interrupts during a mapping run, instead of just one.  However, CPU utilization is still skewed heavily toward one CPU.  We will have to observe what happens under a heavier load than a small mapping run on staging before drawing further conclusions.  The I/O interrupt balancing done by irqbalance cannot be properly tested without heavier I/O than we've been able to generate on staging.  The slideshow reference above indicates that irqbalance should balance block device interrupts.  I think that what it must actually do is periodically switch the SMP affinity based on which CPU is less burdened.
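
One way to check how the ethernet interrupts are spread across CPUs is to total the per-CPU counters in /proc/interrupts.  The following is a minimal sketch and is not part of our installed configuration; the function name irq_counts_per_cpu and the default pattern eth0 are assumptions made for illustration.

```bash
# Sum the per-CPU interrupt counters of every /proc/interrupts line that
# matches a pattern.  $1 = pattern (default: eth0), $2 = file to read
# (default: /proc/interrupts).  Prints one "CPUn total" line per CPU.
irq_counts_per_cpu() {
    awk -v pat="${1:-eth0}" '
        NR == 1 { ncpu = NF; next }    # header row: one column per CPU
        $0 ~ pat {
            # $1 is the IRQ label ("81:"), counters start at $2
            for (i = 1; i <= ncpu; i++) total[i] += $(i + 1)
        }
        END {
            for (i = 1; i <= ncpu; i++) printf "CPU%d %.0f\n", i - 1, total[i]
        }
    ' "${2:-/proc/interrupts}"
}
```

Running `irq_counts_per_cpu eth0` before and after a mapping run and comparing the totals shows which CPUs actually took the interrupts.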

It's not clear what we will have to do when we want to upgrade our servers to Ubuntu 14.04 LTS.  (The relevant ones at the moment, the PostgreSQL database servers, are running 12.04.)  The Amazon Enhanced Networking document says that the ethernet driver that needs to be installed (ixgbevf) is not compatible with 14.04, and that you can instead use the version of ixgbevf that is bundled with Ubuntu 14.04.  That sounds good, but we would need to research how to upgrade with a self-installed version of this module already present.
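
As a starting point for that research, it may help to record which ixgbevf the running system uses and which copy is on disk.  This is a minimal sketch: the helper name driver_field is hypothetical, and it only parses the "key: value" lines that `ethtool -i` and `modinfo` print.

```bash
# Print the value of one "key: value" field read from stdin.
driver_field() {
    awk -F': *' -v key="$1" '$1 == key { print $2; exit }'
}

# Example usage on a live system (nothing here runs when the file is sourced):
#   ethtool -i eth0 | driver_field driver     # driver bound to eth0
#   ethtool -i eth0 | driver_field version    # version of the loaded driver
#   modinfo ixgbevf | driver_field version    # version of the module on disk
#   modinfo ixgbevf | driver_field filename   # path of the module on disk
```

Comparing the loaded version with the on-disk version before and after a test upgrade would show whether the self-installed module or the bundled one ends up in use.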

Steps

  1. Install these apt packages:  dh-autoreconf, git, pkg-config, ethtool
  2. Install the ixgbevf driver and enable enhanced networking: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html#enhanced-networking-ubuntu.  Because this step is complex and requires the AWS EC2 CLI, it has not been automated in the limited time available to fix the immediate issues in front of us.
  3. Disable Ubuntu's irqbalance in /etc/default/irqbalance
  4. Compile and install irqbalance 1.0.9
  5. Add commands to /etc/rc.local that start irqbalance and set smp_affinity, having irqbalance ignore the IRQs that we configure specifically.  The following code has been inserted into /etc/rc.local, but is not yet reflected in our automation.  The IRQs can differ between systems, so this has to be edited to suit the installation.

    # Only act in runlevel 2, Ubuntu's default multi-user runlevel.
    rl=$(runlevel | /usr/bin/awk '{print $2}')
    
    if [ "$rl" = "2" ]; then
        # Run irqbalance to balance I/O interrupts.  The ethernet IRQs are banned
        # from irqbalance here because we set their SMP affinity by hand below.
        /usr/local/sbin/irqbalance --banirq=81 --banirq=82 || exit 1
        # Pin each paired Tx/Rx interrupt queue to its own CPU.  smp_affinity takes
        # a hex CPU bitmask (bit 0 = CPU0, bit 1 = CPU1).  Ethernet interrupts
        # cannot be load-balanced between multiple CPUs the way I/O interrupts can.
        # You have to use the ixgbevf (enhanced networking) kernel module and have
        # individual interrupt queues be handled by specific CPUs.
        # ... eth0-TxRx-0, IRQ #81, mask 0x1 (binary 0001) = CPU0
        echo 1 > /proc/irq/81/smp_affinity
        # ... eth0-TxRx-1, IRQ #82, mask 0x2 (binary 0010) = CPU1
        echo 2 > /proc/irq/82/smp_affinity
    fi
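
Since the IRQ numbers differ between systems, the values to put into /etc/rc.local could be derived from /proc/interrupts instead of being looked up by hand.  The sketch below only prints the commands and does not run them.  The function names, the default queue-name prefix eth0-TxRx-, and the rule "queue N goes to CPU N" are assumptions for illustration, and the single hex mask is only valid for the first 32 CPUs.

```bash
# Print "IRQ queue-name" for each /proc/interrupts line whose last column
# starts with the prefix.  $1 = prefix, $2 = file to read.
eth_queue_irqs() {
    awk -v prefix="${1:-eth0-TxRx-}" '
        index($NF, prefix) == 1 { sub(":", "", $1); print $1, $NF }
    ' "${2:-/proc/interrupts}"
}

# Hex bitmask selecting a single CPU: CPU 0 -> 1, CPU 1 -> 2, CPU 2 -> 4.
cpu_mask() {
    printf '%x\n' $((1 << $1))
}

# Print (do not run) the irqbalance and smp_affinity commands for this host.
# Runs in a subshell so its variables do not leak into the caller.
print_affinity_plan() (
    irqs=$(eth_queue_irqs "$@")
    [ -n "$irqs" ] || exit 1          # no matching queues found
    bans=""
    for irq in $(printf '%s\n' "$irqs" | awk '{ print $1 }'); do
        bans="$bans --banirq=$irq"
    done
    echo "/usr/local/sbin/irqbalance$bans"
    cpu=0
    printf '%s\n' "$irqs" | while read -r irq name; do
        echo "echo $(cpu_mask "$cpu") > /proc/irq/$irq/smp_affinity  # $name"
        cpu=$((cpu + 1))
    done
)
```

Calling `print_affinity_plan` with no arguments reads the live /proc/interrupts; the printed lines can then be reviewed and pasted into /etc/rc.local.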

...