Recursively Find/Replace Inside Files Within a Directory

We recently had to change a handful of usernames in LDAP due to a merging of resources. This was a relatively painless process, but since some services use static authorization files to grant access, some manual post-processing was necessary. The script at the end of this post is something I came up with to update the Subversion auth_files. It’s a bash script that uses a couple of useful tricks:

tree -ifF --noreport /path/to/dir/ | grep -v '/$'
  • The tree command normally prints out an ascii-graphical representation of the file structure rooted in the given path, recursively. The ‘-i’ option tells it not to display the graphics. The ‘-f’ option prints out the full path to each item. The ‘-F’ adds file-type indicators to the end of file names, which I’m using here so I can filter out directories from the list using an inverse grep.
  • The output of the tree command is piped to grep. The ‘-v’ option activates inverse grep, and the ‘/$’ regex will match trailing slashes. This grep will match all lines not ending in ‘/’.
  • The tree command is not standard on all flavors/versions of *nix. It’s missing on OS X, for example.
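On a system without tree, find can produce a similar list. This is a minimal sketch rather than a drop-in replacement: the function name list_files and the throwaway directory are made up for illustration. Note that find -type f lists only regular files and does not append the type indicators that tree’s ‘-F’ adds (such as ‘*’ on executables).

```shell
# list_files DIR: print the full path of every regular file under DIR,
# one per line. Directories are never printed, so no inverse grep is needed.
# (list_files is a hypothetical helper name.)
list_files() {
    find "$1" -type f
}

# Example on a throwaway directory tree
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/repo1/sub"
touch "$demo_dir/repo1/authz" "$demo_dir/repo1/sub/authz"
list_files "$demo_dir" | sort
```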
perl -p -i -e 's|before|after|[ig]' file
  • This perl command edits a file in place, replacing occurrences of “before” with “after”. The bracketed [ig] stands for optional flags and is not typed literally.
  • Adding an i after the final delimiter makes the substitution case-insensitive.
  • Adding a g makes the command replace every match on a line (a.k.a. global) instead of just the first.

Why perl instead of sed for in-place edits?

Not all versions of sed allow in-place edits, especially older ones, so perl is the more universal option. If you know your sed can do in-place edits (check the man page for the ‘-i’ option), then you can replace the perl line in the script below with this:

sed -i'' -e "s|$1|$2|g" "$afile"

Whether you choose perl or sed, remember to double-quote the substitution string so bash expands the variables and hands the values off to sed/perl. With single quotes, the text ‘$1’ and ‘$2’ would be passed through unexpanded: sed would look for a literal ‘$1’ to replace with a literal ‘$2’, and perl would treat them as its own capture variables, so neither would make the intended replacement.
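The difference is easy to check by echoing the string before it ever reaches sed or perl. In this sketch the two usernames are hypothetical stand-ins for the script’s arguments.

```shell
# Stand-ins for the script's positional parameters $1 and $2
# (hypothetical usernames, set here only for the demonstration)
set -- jsmith john.smith

double="s|$1|$2|g"   # bash expands the variables before sed/perl sees the string
single='s|$1|$2|g'   # passed through exactly as typed

echo "$double"   # prints: s|jsmith|john.smith|g
echo "$single"   # prints: s|$1|$2|g
```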

The Script

This code can easily be repurposed for other tasks, but I present it here as I wrote it for the subversion auth_files purpose. (I named it “auth_find”.)

#!/bin/bash

workpath=/opt/auth_files/
outfile=/root/auth_find-$1

# friendly usage function, called if the wrong number of arguments is supplied
usage ()
{
    echo ""
    echo "Usage: auth_find [username] [new-username]"
    echo ""
    echo "This script recursively searches subversion's /opt/auth_files/ directory for"
    echo "the supplied username and returns a list of files that contain it. If a second"
    echo "username is supplied all instances of the first will be replaced with the second."
    echo ""
    echo "Output is sent to both STDOUT and /root/auth_find-username."
    echo ""
    exit 1
}

if [ $# -eq 1 ]; then    # one argument: list the files that contain the username
    echo "Results:"
    for afile in $(tree -ifF --noreport "$workpath" | grep -v '/$'); do
        if grep -q "^$1 " "$afile"; then
            echo "$afile" | tee -a "$outfile"
        fi
    done
elif [ $# -eq 2 ]; then    # two arguments: replace the first username with the second
    echo "Now replacing occurrences of '$1' with '$2' in the following files:"
    for afile in $(tree -ifF --noreport "$workpath" | grep -v '/$'); do
        if grep -q "^$1 " "$afile"; then
            echo "$afile" | tee -a "$outfile-CHANGED"
            perl -p -i -e "s|$1|$2|g" "$afile"
        fi
    done
else    # any other number of arguments: show usage
    usage
fi
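One caveat with the for loop over command output: bash splits the list on whitespace, so a path containing a space would be broken into pieces. If that matters for your files, a variation like the following handles such paths. It is a sketch under assumptions, not the script as I ran it: it swaps tree for find, and the function name find_user_files and the sample files are made up.

```shell
# find_user_files DIR USER: print every regular file under DIR that has a
# line starting with "USER " (the same test the script's grep performs).
# Paths are passed NUL-separated, so spaces in names are preserved.
find_user_files() {
    find "$1" -type f -print0 |
    while IFS= read -r -d '' afile; do
        if grep -q "^$2 " "$afile"; then
            printf '%s\n' "$afile"
        fi
    done
}

# Throwaway example: one matching file whose path contains a space
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/my repo"
printf 'jsmith = rw\n' > "$demo_dir/my repo/authz"
printf 'other = r\n'   > "$demo_dir/readme"
find_user_files "$demo_dir" jsmith
```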

UPDATE (8/31/09): Added “why perl instead of sed” section in response to comment.

Print PDFs as Postscript to an lpr Queue

I wrote a simple script recently for a user who was having trouble getting certain PDFs to print properly from his Linux box (Fedora 10). I first suggested that he try converting the PDFs to PostScript and printing the resulting file. That worked, but he found the process a bit tedious. Here’s the script I wrote to take care of the tediousness. It relies on pdf2ps, which ships with Ghostscript and is standard in Fedora, at least. It should be pretty self-explanatory.

#!/bin/bash

# grab first argument as pdf filename and generate ps filename
thePDF=$1
thePS="$thePDF.ps"
queueName=$2

usage()
{
    echo ""
    echo "Usage: psprint [your pdf] [lpr queue]*"
    echo ""
    echo "This command does three things:"
    echo "  1. Converts the specified pdf file to ps"
    echo "  2. Prints the ps file to your default lpr queue *(unless you specify another queue)"
    echo "  3. Deletes the ps file"
    echo ""
    exit 1
}

if [ $# == 0 ]; then usage; fi
if [ $# == 1 ]; then
    echo "Converting $thePDF ..."
    pdf2ps "$thePDF" "$thePS"
    echo "Sending to default printer ..."
    lpr "$thePS"
    echo "Cleaning up ..."
    rm "$thePS"
    exit 0
fi
if [ $# == 2 ]; then
    echo "Converting $thePDF ..."
    pdf2ps "$thePDF" "$thePS"
    echo "Sending to $2 ..."
    lpr -P "$queueName" "$thePS"
    echo "Cleaning up ..."
    rm "$thePS"
    exit 0
fi
if [ $# -gt 2 ]; then usage; fi

Save the script to a location in your path (/usr/local/bin works) and you’re off.
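The one-argument and two-argument branches differ only in whether lpr gets a ‘-P’ option, so they could be folded together. Here is a rough sketch of the two pieces that would make that possible. The helper names ps_name and lpr_args are invented, and unlike the script above, ps_name swaps a trailing .pdf for .ps instead of appending to it.

```shell
# ps_name PDF: derive the PostScript filename, swapping a trailing .pdf
# for .ps (a name without .pdf just gets .ps appended).
ps_name() {
    printf '%s\n' "${1%.pdf}.ps"
}

# lpr_args PSFILE [QUEUE]: print the arguments lpr would receive, one per
# line, adding -P QUEUE only when a queue name was supplied.
lpr_args() {
    local args=()
    if [ -n "${2:-}" ]; then
        args+=(-P "$2")
    fi
    args+=("$1")
    printf '%s\n' "${args[@]}"
}

ps_name "report.pdf"          # prints: report.ps
lpr_args "report.ps" "lab3"   # prints -P, lab3 and report.ps on separate lines
```

In a real script the array would be handed straight to lpr (lpr "${args[@]}") rather than printed.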

Quickly Add a Userset to Many Sun Grid Engine Queues

This will be the first (of many??) posts to spill outside the topics one would think you’d find on a site with the name “Your Mac Guy”. You’ve been warned.

Back in January my primary work responsibilities shifted from Mac servers and desktops (and all that entailed) to Linux servers and desktops and the multitude of new things that entails (at least here where I work). One of the new tasks I’ve picked up is user administration of our Sun Grid Engine (SGE) 500-node cluster. New or existing users who want to submit jobs to the cluster need to be added to custom groups or, in SGE-speak, usersets. We create usersets for each lab, so if the user is part of a lab that doesn’t currently have access to submit jobs, I need to create a new userset and add that userset to each of 16 separate queues.

That last part, adding usersets to queues, is the most tedious part. So tedious, in fact, that it forced my hand into developing a scripted solution. I likely could have found an existing script to accomplish the task for me, but then I wouldn’t have had an excuse to brush up on my three-years-dormant perl skills.
