Recursively Find/Replace Inside Files Within a Directory

We recently had to change a handful of usernames in LDAP due to a merging of resources. This was a relatively painless process, but since some services use static authorization files to grant access, some manual post-processing was necessary. The script at the end of this post is something I came up with to update the Subversion auth_files. It’s a bash script that uses a couple of useful tricks:

tree -ifF --noreport /path/to/dir/ | grep -v '/$'
  • The tree command normally prints out an ascii-graphical representation of the file structure rooted in the given path, recursively. The ‘-i’ option tells it not to display the graphics. The ‘-f’ option prints out the full path to each item. The ‘-F’ adds file-type indicators to the end of file names, which I’m using here so I can filter out directories from the list using an inverse grep.
  • The output of the tree command is piped to grep. The ‘-v’ option activates inverse grep, and the ‘/$’ regex will match trailing slashes. This grep will match all lines not ending in ‘/’.
  • The tree command is not standard on all flavors/versions of *nix. It’s missing on OS X, for example.
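Where tree is unavailable, find can produce a similar list. The sketch below is an alternative, not what the script in this post uses, and the function name list_files is made up for illustration. Two differences are worth knowing: find also lists hidden files (tree skips them unless given ‘-a’), and it does not append the ‘-F’ indicators, so executable files are not printed with a trailing ‘*’.

```bash
# Hypothetical helper: print every regular file under the directory in $1,
# one full path per line. Roughly equivalent to:
#   tree -ifF --noreport /path/to/dir/ | grep -v '/$'
list_files() {
    find "$1" -type f
}
```

Called as list_files /path/to/dir/, it prints one path per line, which can be piped to grep or a while-read loop.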
perl -p -i -e 's|before|after|[ig]' file
  • This perl command will edit a file in-place, replacing occurrences of “before” with “after”.
  • Adding an i to the end of the substitution string makes it a case insensitive substitution.
  • Adding a g makes the command replace all instances, a.k.a. global, instead of just the first instance.

Why perl instead of sed for in-place edits?

Not all versions of sed allow in-place edits, especially older ones, so perl is the more universal option. If you know your sed can do in-place edits (check the man page for the ‘-i’ option), then you can replace the perl line in the script below with this:

sed -i'' -e "s|$1|$2|g" "$afile"

Whether you choose to use perl or sed, you must remember to double-quote the substitution string so bash expands the variables and hands the values off to sed/perl. Using single quotes here would result in sed/perl looking for a literal ‘$1’ to replace with a literal ‘$2’.

The Script

This code can easily be repurposed for other tasks, but I present it here as I wrote it for the subversion auth_files purpose. (I named it “auth_find”.)

#!/bin/bash

workpath=/opt/auth_files/
outfile=/root/auth_find-$1

# friendly usage function, called if the wrong number of arguments is supplied
usage ()
{
    echo ""
    echo "Usage: auth_find [username] [new-username]"
    echo ""
    echo "This script recursively searches subversion's /opt/auth_files/ directory for"
    echo "the supplied username and returns a list of files that contain it. If a second"
    echo "username is supplied all instances of the first will be replaced with the second."
    echo ""
    echo "Output is sent to both STDOUT and /root/auth_find-username."
    echo ""
    exit 1
}

if [ $# -eq 1 ]; then    # one argument: list the files that contain the username
    echo "Results:"
    for afile in $(tree -ifF --noreport "$workpath" | grep -v '/$'); do
        if grep -q "^$1 " "$afile"; then
            echo "$afile" | tee -a "$outfile"
        fi
    done
elif [ $# -eq 2 ]; then    # two arguments: replace the first username with the second
    echo "Now replacing occurrences of '$1' with '$2' in the following files:"
    for afile in $(tree -ifF --noreport "$workpath" | grep -v '/$'); do
        if grep -q "^$1 " "$afile"; then
            echo "$afile" | tee -a "$outfile-CHANGED"
            perl -p -i -e "s|$1|$2|g" "$afile"
        fi
    done
else    # show usage if incorrect number of arguments given
    usage
fi

UPDATE (8/31/09): Added “why perl instead of sed” section in response to comment.

Can I boot Snow Leopard in 64-bit mode?

UPDATE: Please read Update 2 at the bottom of this post before using a 64-bit kernel as your default.


With Snow Leopard making its appearance this Friday, August 28, 2009, some people may be wondering whether they’ll be able to boot their Macs in 64-bit mode. Only Intel Xserves will boot this way by default. If you want to boot your desktop or mobile Mac in 64-bit mode, you’ll need to take some additional steps. The first is checking to see if your Mac has a 64-bit-capable EFI. If the output of the following command is EFI64, you’re good. If not, you’re out of luck.

    ioreg -l -p IODeviceTree | awk -F'"' '/firmware-abi/{print $4}'
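If you want to check several machines, the same command can be wrapped in a few lines of shell. This is only a sketch: the function name efi_verdict and its messages are made up, and it assumes the firmware reports either EFI64 or EFI32. The ioreg call is skipped on systems that do not have it.

```bash
# Hypothetical helper: turn the firmware-abi string into a readable verdict.
efi_verdict() {
    case "$1" in
        EFI64) echo "64-bit EFI: a 64-bit kernel is possible" ;;
        EFI32) echo "32-bit EFI: out of luck" ;;
        *)     echo "unrecognized firmware-abi value: '$1'" ;;
    esac
}

# Only query the firmware where ioreg exists (i.e. on a Mac).
if command -v ioreg >/dev/null 2>&1; then
    abi=$(ioreg -l -p IODeviceTree | awk -F'"' '/firmware-abi/{print $4}')
    efi_verdict "$abi"
fi
```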

Once you’ve verified it’s possible, you have a couple of options for making your Mac boot into 64-bit mode. I’d try them in this order. First, to affect the current boot only, hold down the ‘6’ and ‘4’ keys during bootup. Once you’ve verified it works and are comfortable with it, you can make the change permanent by adding an ‘arch=x86_64’ boot flag to your com.apple.Boot.plist, like so:

    sudo defaults write /Library/Preferences/SystemConfiguration/com.apple.Boot 'Kernel Flags' 'arch=x86_64'

UPDATE 1 (8/28/09): Apple has a couple new (and one older) knowledge-base articles pertaining to this topic.

  1. Mac OS X Server v10.6: Macs that use the 64-bit kernel
  2. Mac OS X Server v10.6: Starting up with the 32-bit or 64-bit kernel
  3. How to tell if your Intel-based Mac has a 32-bit or 64-bit processor

UPDATE 2 (8/29/09): This post has received quite a few hits, so I now feel the need to include some educational material about why Apple chose to make Snow Leopard boot with a 32-bit kernel by default.

The primary reason is compatibility with third-party software, particularly software that requires kernel extensions. Probably the most widely known examples of software that depends upon kernel extensions, or kexts, are VMware Fusion and Parallels. If you use these to run Windows or Linux on your Mac, you’ll want to keep using a 32-bit kernel. Virtualization software needs direct access to the hardware normally controlled by the kernel (CPU, RAM, Disk) in order to “fool” operating systems into thinking they’re installed on “real” computers. The kernel extensions allow them to do this.

Kexts must be written specifically for 32-bit or 64-bit kernels. They are not interchangeable. Applications, on the other hand, can run at 64-bit even if the kernel is 32-bit. As far as your 64-bit CPU is concerned, the kernel is just another application, in the sense that it is code that is executed on a processor. It is a very important application, though: its job is to arbitrate demands on the system’s resources. Most applications don’t have direct access to the CPU, RAM, or other physical devices, but make requests of the kernel instead.


UPDATE 3 (9/1/09): John Siracusa’s new article on Snow Leopard was posted today. The entire thing is great reading, but I’m linking to the section that addresses 64-bit vs 32-bit here.

Boost VirtualBox disk I/O for Windows VMs

I picked up a VirtualBox Windows VM optimization tip from the MacEnterprise mailing list this morning, supplied by Yadin Flammer. Yadin mentioned that switching your Windows VM’s disk type from the default IDE to SATA and using the Intel Matrix Storage drivers results in faster hardware emulation. I decided to verify this claim by collecting some before and after I/O data. I have a Windows XP VM, but this should apply to all versions of Windows from 2000 onward, both server and client.

I used the freely available Iometer to gather my disk I/O data. Both the before and after tests were run for 5 minutes on a roughly 3 GB test file using the All-In-One test suite. Here are the results.

[Figure: iops-chart]

As you can see, the data clearly shows a slight increase in performance. Using SATA is actually recommended by Sun, as well. On this page, they say,

Like a real SATA controller, VirtualBox’s virtual SATA controller operates faster and also consumes less CPU resources than the virtual IDE controller. Also, this allows you to connect more than three virtual hard disks to the machine.

Makes sense, no? In terms of MBps, my IDE test averaged 17.925 while my SATA test averaged 18.828. Now that we know it’s better, we’ll move on to the installation and configuration procedure.
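For a sense of scale, the two averages above can be turned into a percentage with a one-liner. The variable names are just for illustration; the numbers are the ones from my tests.

```bash
ide=17.925    # average MBps with the IDE controller
sata=18.828   # average MBps with the SATA controller

# Relative improvement of SATA over IDE, as a percentage.
awk -v a="$ide" -v b="$sata" 'BEGIN { printf "%.1f%%\n", (b - a) / a * 100 }'
# prints 5.0%
```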

Installation and Configuration

[Figure: vm-settings]

  1. Shut down your Windows VM and open its settings window (shown above).
  2. Select the Hard Disks item, check Enable Additional Controller and choose SATA (AHCI) from the list.
  3. Leave the Hard disk attached to the IDE Controller in the Attachments section for now, since we’ll first have to install the SATA drivers, and click OK.
  4. Start up your VM again and download the Intel Matrix Storage Manager drivers. Click the link, select your Windows OS version, click Go, and then click the first download link in the Drivers section. Mine showed up as link #1. Save it to your desktop, and then install it. NOTE: If you are given a warning about not meeting the minimum installation requirements, you may need to download and install the Intel Chipset Software Installation Utility first. Follow the same download and install procedure as for the storage drivers.
  5. Once the drivers are installed, shut down your Windows VM and open its settings window.
  6. Select the Hard Disks item as before, but now select SATA Port 0 next to your VM’s .vdi file in the Attachments section.
  7. Click OK and then start up your VM.

… And boom goes the dynamite.
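The two settings changes (steps 2 and 6) can also be made from the command line with VBoxManage. The sketch below only prints the commands so you can review them first. It assumes a VirtualBox release recent enough to have the storagectl and storageattach subcommands, which may be newer than the version I used for this post. The function name sata_commands is made up, and the controller names “SATA” and “IDE Controller” are assumptions; check yours with VBoxManage showvminfo.

```bash
# Hypothetical helper: print the VBoxManage commands for VM $1 and disk image $2.
# Nothing is executed; run the printed lines yourself with the VM shut down.
sata_commands() {
    local vm=$1 disk=$2
    cat <<EOF
# Step 2: add a SATA (AHCI) controller, leaving the disk on IDE for now
VBoxManage storagectl "$vm" --name "SATA" --add sata --controller IntelAhci
# Step 6: after the drivers are installed, move the disk from IDE to SATA port 0
VBoxManage storageattach "$vm" --storagectl "IDE Controller" --port 0 --device 0 --medium none
VBoxManage storageattach "$vm" --storagectl "SATA" --port 0 --device 0 --type hdd --medium "$disk"
EOF
}
```

For example, sata_commands "WinXP" "$HOME/WinXP.vdi" prints the three commands for a VM named WinXP (both names are placeholders).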

Print PDFs as Postscript to an lpr Queue

I wrote a simple script recently for a user who was having trouble getting certain PDFs to print properly from his Linux box (Fedora 10). I first suggested that he try converting the PDFs to PostScript and printing the resulting file. That worked, but he found the process a bit tedious. Here’s the script I wrote to take care of the tedium. It relies on the pdf2ps utility, which is part of Ghostscript and standard in Fedora, at least. It should be pretty self-explanatory.

#!/bin/bash

# grab first argument as pdf filename and generate ps filename
thePDF=$1
thePS="$thePDF.ps"
# optional second argument is the lpr queue name
queueName=$2

usage()
{
    echo ""
    echo "Usage: psprint [your pdf] [lpr queue]*"
    echo ""
    echo "This command does three things:"
    echo "  1. Converts the specified pdf file to ps"
    echo "  2. Prints the ps file to your default lpr queue *(unless you specify another queue)"
    echo "  3. Deletes the ps file"
    echo ""
    exit 1
}

if [ $# -eq 0 ]; then usage; fi
if [ $# -eq 1 ]; then
    echo "Converting $thePDF ..."
    pdf2ps "$thePDF" "$thePS"
    echo "Sending to default printer ..."
    lpr "$thePS"
    echo "Cleaning up ..."
    rm "$thePS"
    exit 0
fi
if [ $# -eq 2 ]; then
    echo "Converting $thePDF ..."
    pdf2ps "$thePDF" "$thePS"
    echo "Sending to $queueName ..."
    lpr -P "$queueName" "$thePS"
    echo "Cleaning up ..."
    rm "$thePS"
    exit 0
fi
# -gt is the numeric test; a bare > inside [ ] would redirect to a file named "2"
if [ $# -gt 2 ]; then usage; fi

Save the script to a location in your path (/usr/local/bin works) and you’re off.
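If you would rather keep this as a shell function (in your ~/.bashrc, say) than as a separate script, the same three steps can be sketched more compactly. This is a variation, not the script above: it writes the PostScript to a temporary file from mktemp instead of next to the PDF, and it returns the exit status of whichever step failed.

```bash
# Convert a PDF to PostScript, print it, and clean up.
# $1 = PDF file, $2 = optional lpr queue name.
psprint() {
    if [ $# -lt 1 ] || [ $# -gt 2 ]; then
        echo "Usage: psprint [your pdf] [lpr queue]" >&2
        return 1
    fi
    local pdf=$1 queue=${2:-} ps rc=0
    ps=$(mktemp "${TMPDIR:-/tmp}/psprint.XXXXXX") || return 1

    pdf2ps "$pdf" "$ps" || rc=$?
    if [ "$rc" -eq 0 ]; then
        if [ -n "$queue" ]; then
            lpr -P "$queue" "$ps" || rc=$?
        else
            lpr "$ps" || rc=$?
        fi
    fi

    rm -f "$ps"    # remove the temporary file whether or not printing worked
    return "$rc"
}
```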