Breaking Down the Monster III

So, finishing this off.

It-sa bunch-a case lines!

Write first:

 

echo $1 $2 "filesize: "$3 "totalsize: "$4"G" "filesperdir: "$5
case $1 in
	write)
		if [ $2 = object ]; then
			filecount=$totfilecount
			time scalitywrite
			exit 0
		fi
        

So if it’s a Scality (or other pure object storage), it’s simple. Just run the write and time it, which will output the info you need. OTHERWISE…

#Chunk file groups into folders if count is too high
	if [ $totfilecount -ge 10000 ]; then
		for dir in `seq 1 $foldercount`; do
			createdir $fspath/$dir
		done
		time for dir in `seq 1 $foldercount`; do
			path=$fspath/$dir
			filecount=$(( $totfilecount / $foldercount ))
			writefiles
		done
	else
		path=$fspath
		createdir $path
		filecount=$totfilecount
		time writefiles
	fi
	;;

 

Do what the comment says: chunk the files into folders, because when you write to a filesystem, the number of files per directory makes a big difference. Make sure you create the directories before you try to write to them, and then time how long it takes to write all of the files. If the total is below the critical file count, just write them into a single directory and time that.
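
With the defaults (a 50GB test set, 5120 files per directory) and a 1M file size, that works out to 10 directories of 5120 files apiece, so the tree ends up looking something like:

$fspath/1/1-1M ... $fspath/1/5120-1M
$fspath/2/1-1M ... $fspath/2/5120-1M
...
$fspath/10/1-1M ... $fspath/10/5120-1M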

Neeeext….

 

read) #in order read
	sync; echo 1 > /proc/sys/vm/drop_caches
	if [ $2 = object ]; then
		filecount=$totfilecount
		time scalityread
		exit 0
	fi
	if [ $totfilecount -ge 10000 ]; then
		time for dir in `seq 1 $foldercount`; do
			path=$fspath/$dir
			filecount=$(( $totfilecount / $foldercount ))
			readfiles
		done
	else
		path=$fspath
		filecount=$totfilecount
		time readfiles
	fi
	;;

That sync line is how you clear the filesystem cache (as root) on a Linux system. This is important for benchmarking, because otherwise you're timing reads out of RAM, and let me tell you, 6.4GB/sec is not a speed that most network storage systems can reach. Again, we split the reads across the subdirectories and time the whole loop, or just time the reads directly if the file count is low enough. This routine reads files in the order they were written.
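
For reference, the kernel accepts three values there (straight out of the kernel's VM documentation):

sync; echo 1 > /proc/sys/vm/drop_caches   #free the page cache
sync; echo 2 > /proc/sys/vm/drop_caches   #free dentries and inodes
sync; echo 3 > /proc/sys/vm/drop_caches   #free all of the above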

 

	rm) #serial remove files
		if [ $2 = object ]; then
			time for i in `seq 1 $totfilecount`; do
				curl -s -X DELETE http://localhost:81/proxy/bparc/$fspath/$i-$suffix > /dev/null
			done
			exit 0
		fi
		if [ $totfilecount -ge 10000 ]; then
			time for i in `seq 1 $foldercount`; do
				rm -f $fspath/$i/*-$suffix
				rmdir $fspath/$i
			done
		elif [ -d $fspath/$3 ]; then 
			time rm -f $fspath/*-$suffix
		fi
	;;

Similar to the other two routines: if it's object-based, do something completely different; otherwise remove based on the file path and the count of files.

 

	parrm) #parallel remove files
		time ls $fspath | parallel -N 64 rm -rf $fspath/{}
	;;

This one is remarkably simple. Just run parallel against an ls of the top level directory and let it fan out the rm -rf calls. The {} is the placeholder that parallel fills in from its input, and -N 64 hands the input to each rm invocation in groups of up to 64; the number of simultaneous jobs is controlled by -j, which defaults to one per CPU core.
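
If the flags are unfamiliar, here's a tiny demo of the grouping (assuming GNU parallel; -k just keeps the output in input order):

seq 1 6 | parallel -k -N 2 echo {1} {2}
#1 2
#3 4
#5 6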

 

This one’s kind of neat:

	shufread) #shuffled read
		sync; echo 1 > /proc/sys/vm/drop_caches
		if [ $totfilecount -ge 10000 ]; then
			folderarray=(`shuf -i 1-$foldercount`)
			time for dir in ${folderarray[*]}; do
				path=$fspath/$dir
				filecount=$(( $totfilecount / $foldercount ))
				shufreadfiles
			done
		else
			path=$fspath
			filecount=$totfilecount
			time shufreadfiles
		fi
	;;
	

I needed a way to do random reads over the files I'd written, to simulate random access on filesystems with little caching (i.e., to make the drives do a lot of random seeks).

At first, I tried writing the file paths to a file, then reading that back, but that has waaaay too much latency when you're doing performance testing. So, after some digging, I found the shuf command, which shuffles a list; the -i flag generates and shuffles a numeric range for you. I tossed the result into an array, and then it proceeds like the read section.
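
A quick illustration of the trick (the output is random by design, so this is just one possible run):

shuf -i 1-5                  #prints 1 through 5, one per line, in random order
folderarray=(`shuf -i 1-5`)  #e.g.: folderarray = (3 5 1 4 2)
echo ${folderarray[*]}       #3 5 1 4 2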

 

	*) usage && exit 1;;
esac
echo '------------------------'

Fairly self-explanatory. I tossed in an echo with a line of dashes to keep the output readable if you're running the command inside a for loop.
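
For instance, something along these lines (ddcompare.sh is a stand-in for whatever you've named the script):

for size in 64K 1M 16M; do
	./ddcompare.sh write tier1 $size
	./ddcompare.sh read tier1 $size
	./ddcompare.sh rm tier1 $size
done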

And that’s it!


Breaking down the monster, part the second

Let’s do this thing.

This little snippet tells the bash builtin time to output real (wall-clock) seconds to one decimal place (like the comment says). Very handy modifier, and of course there are others.

#defines output of time in realtime seconds to one decimal place
TIMEFORMAT=%1R
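
A few of the others, for reference (these all come from the TIMEFORMAT section of the bash man page):

TIMEFORMAT=%3lR   #real time, long format, e.g. 3m2.452s
TIMEFORMAT=%U     #user CPU time
TIMEFORMAT=%S     #system CPU time
TIMEFORMAT=%P     #CPU percentage: (user + system) / real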

Here's a function to create directories. Seems kind of silly, but if you don't check whether the directory exists first, a plain mkdir throws an error, and who wants that? (The -p flag would also paper over that on its own; the explicit check just makes the intent obvious.)

#creates directory to write to
createdir () {
	if [ ! -d $1 ]; then
		mkdir -p $1
	fi
}

Now we get to the meat of the thing. There are five functions to do the writes, reads, and shuffled reads. But wait, that's only three operations, you say! Well, object is different. So there are two separate functions for writing to and reading from the object store. Obviously these would need to be heavily modified for any object-based storage that's not Scality, since we're using their sproxyd method at this time. I haven't looked into what Swift or S3 look like, but they're probably fairly similar, syntax-wise.

So, writes:

#write test
writefiles () {
	#echo WRITE
	for i in `seq 1 $filecount`; do 
		#echo -n .
		dd if=/dev/zero of=$path/$i-$suffix bs=$blocksize count=$blockcount 2> /dev/null
	done
}

Pretty straightforward here. All the variables, aside from i, are globally defined outside the function. The commented-out echo -n . just prints some pretty …….. progress dots, if you're into that sort of thing.
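
To make that concrete: with a 1M file size, the setup math from the previous post gives blocksize=1048576 and blockcount=1, so the first pass through the loop runs:

dd if=/dev/zero of=$path/1-1M bs=1048576 count=1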

Same with reads:

#read test
readfiles () {
	#echo READ
	for i in `seq 1 $filecount`; do 
		#echo -n .
		dd if=$path/$i-$suffix of=/dev/null bs=$blocksize 2> /dev/null
		#dd if=$path/$i-$suffix of=/dev/null bs=$blocksize
	done
}

Now the shuffled reads. I played with a few ways of doing this (like writing out the list to a file, then reading that file back), but writing to a file and reading from it is expensive, which gives bum results on the timing. So, an array it is, created by shuffling a list of numbers (shuf -i 1-$filecount). There's still some debugging code commented out in here.

#shuffled read test
shufreadfiles () {
	#echo SHUFFLE READ
	filearray=(`shuf -i 1-$filecount`)
	for i in ${filearray[*]}; do 
		#echo -n .
		#echo $path/$i-$suffix
		dd if=$path/$i-$suffix of=/dev/null bs=$blocksize 2> /dev/null
		#dd if=$path/$i-$suffix of=/dev/null bs=$blocksize
	done
}

Now we get to the object stuff. I wanted to eliminate the need for reading from a file to do the writes, so I jiggered up curl to take its stdin from dd. This is done with the -T- flag. It doesn't work in some circumstances (which I will detail in a later post when I talk about Isilon RAN object access), but it does here with plain ol' unencrypted http calls.

#ObjectWrite
scalitywrite () {
    for i in `seq 1 $filecount`; do
        dd if=/dev/zero bs=$blocksize count=$blockcount 2> /dev/null | \
            curl -s -X PUT http://localhost:81/proxy/bparc$fspath/$i-$suffix -T- > /dev/null
    done
}

So there's a lot of mess in here from when I was trying to get rid of output from curl and dd. This is fairly tricky: dd writes its stats to stderr like a normal program, but curl needs to be silenced (-s) as well as having its stdout sent to /dev/null. The other note is that Scality sproxyd can be written to in two ways: with a hash that contains some metadata about how to protect the objects and where to put them, or by path. The path is hashed by the system, and the object is stored under that hash. Note that you CAN'T list files, and the object is not stored in the system by its path. The full path can be retrieved, but not searched for.

The read is much simpler:

#ObjectRead
scalityread () {
    for i in `seq 1 $filecount`; do
        curl -s -X GET http://localhost:81/proxy/bparc/$fspath/$i-$suffix > /dev/null
    done
}

OK, in the next post, I’ll get to the heart of the script, where it calls all of these here functions.

Breaking down that monster

Or should I use Beast? No, this isn’t an XtremIO. (sorry, I just got back from EMCWorld 2015. The marketing gobbledygook is still strong in me.)

So, first part of the script, like many others, is a function (cleverly called usage), followed by the snippet that calls the function:


usage () {
	echo "Command syntax is $(basename $0) [write|read|shufread|rm|parrm] [test|tier1|tier2|gpfs|localscratch|localssd|object]"
	echo "[filesizeG|M|K] [totalsize in GB] (optional) [file count per directory] (optional)"
}

if [ "$#" -lt 3 ]; then
	usage
	exit 1
fi

Not much to see here if you already know what functions are and how they're formatted in bash. Basically, if a name is followed by () { and closed with }, it's a function, and you can call it like a script inside the main script. The code is not executed until the function is called by name. You can even pass it arguments; more on that later.
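
A minimal sketch of the pattern (greet is just a made-up example), including how arguments show up inside the function as $1, $2, and so on:

greet () {
	echo "Hello, $1"
}
greet world   #prints: Hello, world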

Next, we come to a case block:


case $2 in
	test) fspath=/mnt/dmtest/scicomp/scicompsys/ddcompare/$3 ;;
	tier1) fspath=/mnt/node-64-dm11/ddcompare/$3 ;;
	tier2) fspath=/mnt/node-64-tier2/ddcompare/$3 ;;
	gpfs) fspath=/gpfs1/nlsata/ddcompare/$3 ;;
	localscratch) fspath=/scratch/carlilek/ddcompare/$3 ;;
	localssd) fspath=/ssd/ddcompare/$3 ;;
	object) fspath=/srttest/ddcompare/$3 ;;
	*) usage && exit 1;;
esac

This checks the second variable and sets the base path to be used in the testing. Note that object will be used differently than the rest, because all of the rest are file storage paths. Object ain’t.
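
So, for example, test as $2 with a 1M file size as $3 works out to:

fspath=/mnt/dmtest/scicomp/scicompsys/ddcompare/1M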

Then, we set the size of the files (or objects) to be written, read, or deleted:


case $3 in
	*G) filesize=$(( 1024 * 1024 * `echo $3 | tr -d G`));;
	*M) filesize=$(( 1024 * `echo $3 | tr -d M` ));;
	*K) filesize=`echo $3 | tr -d K`;;
	*) usage && exit 1;;
esac

Note that I should probably be using the newer command substitution style, $( ), here rather than backticks. I'll get around to it at some point.

The bizarre $(( blah op blah )) setup is how you do math in bash. Really.
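
To see both in action (with a literal 4M standing in for $3):

echo $(( 2 + 3 ))                            #prints 5
filesize=$(( 1024 * `echo 4M | tr -d M` ))   #backticks: filesize=4096
filesize=$(( 1024 * $(echo 4M | tr -d M) ))  #$( ) style: same result, and it nests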

The next few bits are all prepping how many files to write to a given subdirectory, how big the files are, etc.


#set the suffix for file names
suffix=$3

#set the total size of the test set
if [ ! -z $4 ]; then
	totalsize=$(( 1024 * 1024 * $4 ))
else
	totalsize=52428800 #The size of the test set in kb
fi
	
#set the number of files in subdirectories
if [ ! -z $5 ]; then
	filesperdir=$5
else
	filesperdir=5120 #Number of files per subdir for large file counts
fi

#set up variables for dd commands
if [ $filesize -ge 1024 ]; then
	blocksize=1048576
else
	blocksize=$(( $filesize * 1024 ))
fi

#set up variables for subdirectories
totfilecount=$(( $totalsize / $filesize ))
blockcount=$(( $filesize * 1024 / $blocksize ))
if [ $filesperdir -le $totfilecount ]; then
	foldercount=$(( $totfilecount / $filesperdir ))
fi
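
Traced through with a concrete invocation, say write tier1 1M with $4 and $5 left at their defaults (ddcompare.sh being a stand-in name for the script):

#./ddcompare.sh write tier1 1M
filesize=1024         #KB, from the *M case: 1024 * 1
totalsize=52428800    #KB, the default 50GB test set
filesperdir=5120      #default
blocksize=1048576     #bytes, since filesize >= 1024
blockcount=1          #1024 * 1024 / 1048576
totfilecount=51200    #52428800 / 1024
foldercount=10        #51200 / 5120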

OK, I’ll get into the meat of the code in my next post. But I’m done now.

The first of several benchmarking scripts

I’m currently a file storage administrator, specializing in EMC Isilon. We have a rather large install (~60 heterogeneous nodes, ~4PB) as well as some smaller systems, an HPC dedicated GPFS filer from DDN, and an object based storage system from Scality. Obviously, all of these things have different performance characteristics, including the differing tiers of Isilon.

I’ve been benchmarking the various systems using the script below. I’ll walk through the various parts of the script. To date, this is probably one of my more ambitious attempts with Bash, and it would probably work better in Python, but I haven’t learned that yet. 😉


#!/bin/bash
usage () {
	echo "Command syntax is $(basename $0) [write|read|shufread|rm|parrm] [test|tier1|tier2|gpfs|localscratch|localssd|object]"
	echo "[filesizeG|M|K] [totalsize in GB] (optional) [file count per directory] (optional)"
}

if [ "$#" -lt 3 ]; then
	usage
	exit 1
fi

#CHANGE THESE PATHS TO FIT YOUR ENVIRONMENT
#set paths
case $2 in
	test) fspath=/mnt/dmtest/scicomp/scicompsys/ddcompare/$3 ;;
	tier1) fspath=/mnt/node-64-dm11/ddcompare/$3 ;;
	tier2) fspath=/mnt/node-64-tier2/ddcompare/$3 ;;
	gpfs) fspath=/gpfs1/nlsata/ddcompare/$3 ;;
	localscratch) fspath=/scratch/carlilek/ddcompare/$3 ;;
	localssd) fspath=/ssd/ddcompare/$3 ;;
	object) fspath=/srttest/ddcompare/$3 ;;
	*) usage && exit 1;;
esac

#some math to get the filesize in kilobytes
case $3 in
	*G) filesize=$(( 1024 * 1024 * `echo $3 | tr -d G`));;
	*M) filesize=$(( 1024 * `echo $3 | tr -d M` ));;
	*K) filesize=`echo $3 | tr -d K`;;
	*) usage && exit 1;;
esac	

#set the suffix for file names
suffix=$3

#set the total size of the test set
if [ ! -z $4 ]; then
	totalsize=$(( 1024 * 1024 * $4 ))
else
	totalsize=52428800 #The size of the test set in kb
fi
	
#set the number of files in subdirectories
if [ ! -z $5 ]; then
	filesperdir=$5
else
	filesperdir=5120 #Number of files per subdir for large file counts
fi

#set up variables for dd commands
if [ $filesize -ge 1024 ]; then
	blocksize=1048576
else
	blocksize=$(( $filesize * 1024 ))
fi

#set up variables for subdirectories
totfilecount=$(( $totalsize / $filesize ))
blockcount=$(( $filesize * 1024 / $blocksize ))
if [ $filesperdir -le $totfilecount ]; then
	foldercount=$(( $totfilecount / $filesperdir ))
fi

#debug output
#echo $fspath
#echo filecount $totfilecount
#echo totalsize $totalsize KB
#echo filesize $filesize KB
#echo blockcount $blockcount
#echo blocksize $blocksize bytes

#defines output of time in realtime seconds to one decimal place
TIMEFORMAT=%1R

#creates directory to write to
createdir () {
	if [ ! -d $1 ]; then
		mkdir -p $1
	fi
}

#write test
writefiles () {
	#echo WRITE
	for i in `seq 1 $filecount`; do 
		#echo -n .
		dd if=/dev/zero of=$path/$i-$suffix bs=$blocksize count=$blockcount 2> /dev/null
	done
}

#read test
readfiles () {
	#echo READ
	for i in `seq 1 $filecount`; do 
		#echo -n .
		dd if=$path/$i-$suffix of=/dev/null bs=$blocksize 2> /dev/null
		#dd if=$path/$i-$suffix of=/dev/null bs=$blocksize
	done
}

#shuffled read test
shufreadfiles () {
	#echo SHUFFLE READ
	filearray=(`shuf -i 1-$filecount`)
	for i in ${filearray[*]}; do 
		#echo -n .
		#echo $path/$i-$suffix
		dd if=$path/$i-$suffix of=/dev/null bs=$blocksize 2> /dev/null
		#dd if=$path/$i-$suffix of=/dev/null bs=$blocksize
	done
}

#ObjectWrite
scalitywrite () {
    for i in `seq 1 $filecount`; do
        dd if=/dev/zero bs=$blocksize count=$blockcount 2> /dev/null | curl -s -X PUT http://localhost:81/proxy/bparc$fspath/$i-$suffix -T- > /dev/null
    done
}

#ObjectRead
scalityread () {
    for i in `seq 1 $filecount`; do
        curl -s -X GET http://localhost:81/proxy/bparc/$fspath/$i-$suffix > /dev/null
    done
}

#Do the work based on the work type

echo $1 $2 "filesize: "$3 "totalsize: "$4"G" "filesperdir: "$5
case $1 in
	write)
		if [ $2 = object ]; then
			filecount=$totfilecount
			time scalitywrite
			exit 0
		fi
		#Chunk file groups into folders if count is too high
		if [ $totfilecount -ge 10000 ]; then
			for dir in `seq 1 $foldercount`; do
				createdir $fspath/$dir
			done
			time for dir in `seq 1 $foldercount`; do
				path=$fspath/$dir
				filecount=$(( $totfilecount / $foldercount ))
				writefiles
			done
		else
			path=$fspath
			createdir $path
			filecount=$totfilecount
			time writefiles
		fi
	;;
	read) #in order read
		sync; echo 1 > /proc/sys/vm/drop_caches
		if [ $2 = object ]; then
			filecount=$totfilecount
			time scalityread
			exit 0
		fi
		if [ $totfilecount -ge 10000 ]; then
			time for dir in `seq 1 $foldercount`; do
				path=$fspath/$dir
				filecount=$(( $totfilecount / $foldercount ))
				readfiles
			done
		else
			path=$fspath
			filecount=$totfilecount
			time readfiles
		fi
	;;
	rm) #serial remove files
		if [ $2 = object ]; then
			time for i in `seq 1 $totfilecount`; do
				curl -s -X DELETE http://localhost:81/proxy/bparc/$fspath/$i-$suffix > /dev/null
			done
			exit 0
		fi
		if [ $totfilecount -ge 10000 ]; then
			time for i in `seq 1 $foldercount`; do
				rm -f $fspath/$i/*-$suffix
				rmdir $fspath/$i
			done
		elif [ -d $fspath/$3 ]; then 
			time rm -f $fspath/*-$suffix
		fi
	;;
	parrm) #parallel remove files
		time ls $fspath | parallel -N 64 rm -rf $fspath/{}
	;;
	shufread) #shuffled read
		sync; echo 1 > /proc/sys/vm/drop_caches
		if [ $totfilecount -ge 10000 ]; then
			folderarray=(`shuf -i 1-$foldercount`)
			time for dir in ${folderarray[*]}; do
				path=$fspath/$dir
				filecount=$(( $totfilecount / $foldercount ))
				shufreadfiles
			done
		else
			path=$fspath
			filecount=$totfilecount
			time shufreadfiles
		fi
	;;
		
	*) usage && exit 1;;
esac
echo '------------------------'

I’ll break this all down in my next post.

Simple script for restarting the CELOG on Isilon

If your Isilon cluster has its CELOG fill up to the point where it no longer sends you email alerts (and/or SNMP traps) and you can't clear it yourself, even with the CLI, you'll probably need this script. It's a compilation of steps that support walked me through several times, which I got tired of looking up in my old emails.


#!/bin/bash
isi services -a celog_coalescer disable
isi services -a celog_monitor disable
isi services -a celog_notification disable
sleep 120
isi_for_array killall isi_mcp
isi_for_array pkill isi_celog_
sleep 60
isi_for_array rm -rf /var/db/celog/*
isi_for_array rm -rf /var/db/celog_master/*
rm -rf /ifs/.ifsvar/db/celog/*
isi_for_array isi_mcp
sleep 30
isi services -a celog_coalescer enable
sleep 30
isi services -a celog_monitor enable
sleep 30
isi services -a celog_notification enable
sleep 30
isi services -a celog_coalescer enable
isi services -a celog_monitor enable
isi services -a celog_notification enable

Nothing special here, but perhaps it will come in handy for someone. I have heard that they are aware of the bug and it will be fixed in a future release of OneFS.

Hideous powershell and bash scripts for comparing groups

I apparently started writing this post many moons ago. I have no idea what I was doing, but maybe someone will find it useful.

$Groups = Get-ADGroup -Properties * -Filter * -SearchBase "OU=Groups,OU=SciComp,DC=hhmi,DC=org"
Foreach($G In $Groups)
{
 New-Item U:\Documents\adgroups\$($G.Name) -type file
 Add-Content U:\Documents\adgroups\$($G.Name) $G.Members
}
for i in `ls`; do mv $i `echo $i | awk -F, '{print $1}' | awk -F= '{print $2}'`; done

for i in `ls`; do cat $i | awk -F, '{print $1}' | awk -F= '{print $2}' > mod/$i; done

mv mod/* .

for i in `ls adgroups`; do /root/grouptest.sh $i 2>/dev/null | sort > ldapgroups/$i; done