I Don’t Need Regular Expressions

Sep 1, 2014. Posted under: 3dsMax, DotNet, Maxscript, Tips and Tricks

For a long time there were two things in life that truly scared me and that I avoided at all costs. One was the soundtrack from Frozen, the other was Regular Expressions. I came to realise that my avoidance techniques were simply a defence mechanism against something I didn’t understand. After trying to understand both of my fears, I have come to the conclusion that I’m right to avoid the soundtrack from Frozen, but Regex is my best friend.

In this article I’ll be discussing Regex and trying to demystify it enough for you to use it in basic scripts in 3dsMax. It took me a long time to realise just how useful regular expressions can be, but it is sometimes difficult to get past how cryptic they look. This is not going to be a tutorial on the more complex aspects of Regex, nor do I consider myself an expert, so I’ll just be looking at some usable examples based on three occasions where they have really helped me, so that you can use them in your 3D pipeline scripts.

What are they used for?

Put simply, Regular Expressions are a way of searching and manipulating strings. But wait, Maxscript already has more than a few ways of handling strings, so let’s first consider what we have in order to work out if we actually need another way. Here is a list of the most common string manipulation methods available to maxscript.

matchPattern -- Tests whether a string matches a wildcard pattern
substituteString -- Swaps one substring for another
filterString -- Splits a string into pieces
findString -- Returns the index of a substring within a string
subString -- Creates a new string from a portion of another
stricmp -- Case-insensitive string comparison, very useful in qSort operations

These are very useful, and I use them all the time. They cover most scenarios and they are well documented in the help. So why use a different method? Most pattern matching in a string can be handled by matchpattern, but what if you are looking for a particular selection of characters, or capitals, or numeric digits? If you need that extra control, then unless you want to use filterstring to split and analyse each part of the string, regex is easier to use.

Let’s consider 3 examples of instances where it might be necessary to manipulate strings in our day to day work.

  1. Validating a filename to see if it conforms to a pipeline structure
  2. Selecting elements from hierarchies with the scene explorer
  3. Parsing a filename argument for a commandline script

Validating a Filename

One thing that needs to happen a lot is checking if a filename conforms to a particular pattern. We have a saving routine that automatically backs up and versions our older files into a separate location. This means there is always one ‘current’ version of a file like a rig or model that anyone can work on. Let’s say that there are a few pieces of information that we want to be able to check for, like the name of the asset, the type, the initials of the artist who last worked on the file, the major file version, and the minor increment.

So we could have a filename like this:

filename = "Rabbit_RIG_LR_v01_07.max"

If we wanted to ask the question “Is this file in the correct format?” we could try to use matchpattern…

matchpattern filename pattern:"*_*_??_v??_??.max"

It gets us pretty close. Note that you can use the question mark in matchpattern to indicate that there are a certain number of characters, rather than the wildcard pattern * which allows any number of characters.  The problem with this is that we are looking for a specific configuration of numerals and letters, so this would also be a match:

filename = "Rabbit_RIG_lr_vEr_LR.max"
matchpattern filename pattern:"*_*_??_v??_??.max"
-- true

Ironically, the pattern we have constructed for matchpattern is about as complex as the regex string needed. There are no regular expressions built into maxscript, but we can use dotnet instead. Here’s how you do it:

rx = dotnetobject "System.Text.RegularExpressions.Regex" <<RegexString>>

If you want to use lots of different regex patterns, you could just use the class and construct them on the fly, but when you have a single pattern that you want to validate, it’s easier to set this up at the start.  The method that regex has to determine that it is a match is the crazily named .isMatch()
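As a rough sketch of those two approaches, here is the same check done with a reusable Regex object and with the static IsMatch method on the class. The pattern is just a literal piece of text chosen for illustration, and I'm assuming a 3dsMax version with dotnet support.

```maxscript
-- Option 1: build a Regex object once and reuse it
rx = dotnetobject "System.Text.RegularExpressions.Regex" "_RIG_"
rx.isMatch "Rabbit_RIG_LR_v01_07.max"
-- true

-- Option 2: call the static IsMatch on the class, passing the pattern each time
rxClass = dotnetclass "System.Text.RegularExpressions.Regex"
rxClass.isMatch "Rabbit_RIG_LR_v01_07.max" "_RIG_"
-- true
```

The object version is the one used in the rest of this post, since the pattern only has to be written once.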

So regex is pretty similar, but you can differentiate between letters and numbers. To check for a single digit, you can use either of the following:

\d

[0-9]

and if you need to check for a certain number of digits, you can use the curly braces

[0-9]{2}

[0-9]{2,4} -- between 2 and 4 digits
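A quick sketch of how the curly braces behave in practice, using made-up test strings. One thing worth knowing is that isMatch looks for the pattern anywhere inside the string, so a longer run of digits still matches unless you pin the pattern down. The ^ and $ anchors used in the second half are standard regex syntax that isn't covered elsewhere in this post.

```maxscript
rx = dotnetobject "System.Text.RegularExpressions.Regex" "[0-9]{2,4}"
rx.isMatch "v7"       -- false, only one digit
rx.isMatch "v42"      -- true
rx.isMatch "v12345"   -- true, because "1234" is found inside the longer run

-- ^ and $ pin the pattern to the start and end of the string
rxWhole = dotnetobject "System.Text.RegularExpressions.Regex" "^[0-9]{2,4}$"
rxWhole.isMatch "12345"  -- false
rxWhole.isMatch "0634"   -- true
```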

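This is very useful. Checking for words and characters is easy too. Broadly speaking you can use the full stop (any single character), the asterisk (repeat the previous item any number of times) or \w (a letter, digit or underscore). If you wanted to check that a render was part of a 4 numeral padded sequence, for example, you could do this: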

rx = dotnetobject "System.Text.RegularExpressions.Regex" "\w_[0-9]{4}.*"
filename = "Characters_Main_Velocity_v01_0634.jpg"
rx.isMatch filename

So going back to our regex, we need to check for the version, two digits, an underscore, and another two digits. So we are adding this square brace syntax into the string that we need to check for…

-- remember - a matchpattern version of this would be "*_v??_??.max"
-- in regex the "anything" wildcard is .* (any character, repeated), not a bare *
".*_v[0-9]{2}_[0-9]{2}.max"

So the regex that checks just the version structure is this:

filename = "Rabbit_RIG_LR_v01_08.max"
rx = dotnetobject "System.Text.RegularExpressions.Regex" ".*_v[0-9]{2}_[0-9]{2}.max"
rx.isMatch filename

We can extend this to incorporate the two-letter initials too. Using the same square bracket syntax, we can specify that we want two letters, and that they should be capitals.

-- [A-Z]{number of letters}
rx = dotnetobject "System.Text.RegularExpressions.Regex" ".*_[A-Z]{2}_v[0-9]{2}_[0-9]{2}.max"
rx.isMatch filename

So this is almost there. We now just need to check there is a name and an asset type. We could check for specific words depending on what part of the pipeline we want to identify, but here we just need to require the underscore between them. We also don’t want numbers in the asset type. We can use [a-zA-Z]+ to indicate that we want one or more letters.

filename = "Rabbit_RIG01_LR_v01_08.max" 
rx = dotnetobject "System.Text.RegularExpressions.Regex" ".*_[a-zA-Z]+_[A-Z]{2}_v[0-9]{2}_[0-9]{2}.max"
rx.isMatch filename
--false
filename = "Rabbit_RIG_LR_v01_08.max"
rx = dotnetobject "System.Text.RegularExpressions.Regex" ".*_[a-zA-Z]+_[A-Z]{2}_v[0-9]{2}_[0-9]{2}.max"
rx.isMatch filename
-- true

So there it is. It looks a little more daunting in its complete form, but when you break it down into its component parts, it is actually not that bad. For sure, there will be more compact regex strings that do this kind of thing. Many times you’ll look for examples and find that people have made the truncation of regex strings an art form in itself. For instance, [^0-9] means any character that is not a digit, but note that it would also accept underscores and punctuation, so it is not the same as matching only letters. In this slightly longer form the pattern is easier to understand.
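If you want to go one step further and pull the individual pieces out of the filename, round brackets create capture groups that can be read back after a match. The following is only a sketch under a few assumptions of mine: the pattern is anchored with ^ and $ so the whole filename has to conform, the dot before max is escaped so it only matches a real full stop, and a verbatim @"..." string is used so that backslash reaches the regex untouched.

```maxscript
-- five bracketed groups: asset name, asset type, initials, major version, minor increment
rx = dotnetobject "System.Text.RegularExpressions.Regex" @"^(.+)_([a-zA-Z]+)_([A-Z]{2})_v([0-9]{2})_([0-9]{2})\.max$"
m = rx.match "Rabbit_RIG_LR_v01_08.max"
if m.success do
(
    -- group 0 is the whole match, groups 1 to 5 are the bracketed parts in order
    g = m.groups
    format "asset: %  type: %  artist: %  version: %.%\n" g.item[1].value g.item[2].value g.item[3].value g.item[4].value g.item[5].value
)
-- asset: Rabbit  type: RIG  artist: LR  version: 01.08
```

With the anchors in place, a name like "Rabbit_RIG_LR_v01_08.max.bak" no longer counts as a match, which the unanchored pattern above would let through.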

Selecting elements with the scene explorer

One fact about the scene explorer is that you can use regex as a custom selection filter. Let’s apply what we learned in the previous section to the scene explorer. Firstly, we will need to set the scene explorer to use regex as the search method. Press H and select the option shown in the screenshot.

[Screenshot SE_2: the scene explorer search option set to regular expressions]

In my test scene, I have run the following script to create 1000 teapots at random locations.

for i = 1 to 1000 do
    teapot name:("Teapot" + (formattedprint ((random 0 9999) as integer) format:"04i") + "_Mesh__MF") pos:(random [0,0,0] [1000,1000,1000])

This script also gives them a random index name between zero and 9999.

So let’s assume a hypothetical situation where we want to select all teapots with indices between 2000 and 3999, and between 6000 and 7999. How would we do this without an arduous manual selection across much of the dialog? Based on what we did before, we can write a regex pattern for the first range like this:

Teapot[2-3][0-9]{3}_

However, we need to specify the other range too. We can use the (|) notation for this, which means either one or the other. So to combine the ranges [2-3][0-9]{3} and [6-7][0-9]{3} we end up with:

Teapot([2-3][0-9]{3}|[6-7][0-9]{3})_

Pasting this string into the Find box automatically selects every teapot whose index falls within the ranges we specified.
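The same pattern can also be used from a script rather than the dialog. This is a minimal sketch, assuming the teapot scene created above; it simply tests every object name against the regex and selects the ones that match.

```maxscript
rx = dotnetobject "System.Text.RegularExpressions.Regex" "Teapot([2-3][0-9]{3}|[6-7][0-9]{3})_"
-- collect every scene object whose name matches the pattern
matches = for o in objects where rx.isMatch o.name collect o
select matches
format "% objects selected\n" matches.count
```

Since both ranges share the same three trailing digits, Teapot[2367][0-9]{3}_ would select the same objects. I find the longer version easier to read back later.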

The crowd says teapot selecta!

If you want to try something cool based on this, try entering

.*[0-9]*[02468]_

or

.*[0-9]*[13579]_

into the Find field. These select all even or all odd numbered teapots in the scene. From what we have learned so far, you can see how I’m differentiating between the two: only the last digit before the underscore decides whether a number is even or odd. This is not just applicable to teapots; I’ve used it for more complex selections in rig hierarchies, so it is good to know it is there.

Parsing a commandline argument

Using commandline applications is going to be covered in another post, as it’s too useful to skim over in just this small section. But I wanted to share an example of a script I use for converting EXR files into half-res Quicktimes for fast and easy file checking. If you have something like Deadline, it is easy to use Draft or a custom Python script to do something like this. You can even start Nuke in terminal mode to convert, but it might be overkill to use a Nuke license for this. There is an amazing open source program called DJV that has a commandline option that does this exact thing. You can get DJV from here:

http://djv.sourceforge.net/

 

The great thing is that it supports OpenEXR without you having to compile the damn thing yourself. This is very useful for pipeline imaging automation. What DJV needs is a very specific filename argument. If we want to transcode a file sequence, we have to supply the filename in the following format:

Filename_<FirstFrame><Hyphen><LastFrame>.<Extension>

So when writing a routine to automate this, I decided that as Regex and I were like Riggs and Murtaugh, I’d use it for creating the commandline argument. The full code is below. Take a look and see how I’m using regex to split and concatenate the file sequence into a single string ready to process. I’ll step through this script in a later post if you can’t get the gist of it, but it’s not really important if you just want a free, open source method of converting renders. The cool thing is you can use a post render script or Deadline task to generate this automatically, meaning you can arrive in the morning and make a fast preliminary check of renders without having to load large EXR sequences or needing a 3rd party player like PD or RV.

global djv

Struct djvOps
(
    -- i've declared these in the struct, you could do it at instantiation also.
    x86Path = @"C:\CMD_useful\DJV\x86\djv_convert.exe",
    x64Path = @"C:\CMD_useful\DJV\x64\djv_convert.exe",
    input = undefined,
    resolution = 0.5,
    isExr = false,
    openOutput = true,

    -- includes gamma correction in the argument string for OpenEXR
    fn cmdArgs inStr outStr res =
    (
        if djv.isExr then
            (inStr + " " + outStr + @" -load OpenEXR Gamma 1.0 -scale " + (res as string) )
        else
            (inStr + " " + outStr + @" -scale " + (res as string) )
    ),

    -- function parseImageSequenceToInputString
    -- We Need a string to pass to the DJV commandline
    -- this is a combination of the first and last frames, with a hyphen between them
    -- an example of how DJV needs a string output is "C:\Users\LoneRobot\Documents\3dsMax\renderoutput\sq_PassTest\T_0000-0100.exr"
    -- This works only with padded sequences between 4 and 5 digits
    fn parseImageSequenceToInputString =
    (
        if djv.input != undefined and doesfileexist djv.input then
        (
            -- start with an empty array so the count check below is always safe
            local seq = #()
            -- one pattern for the frame padding, used for both the replace and the match below
            local rxPadding = dotnetobject "System.Text.RegularExpressions.Regex" "[0-9]{4,5}"
            if classof djv.input == string then
            (
                -- Regex replace the filename padding and change for wildcard to get all of the image sequence
                wcPattern = rxPadding.replace djv.input "*" 1
                --format "wcPattern - %\n" wcPattern
                if wcPattern as name != djv.input as Name then seq = sort ( getfiles wcPattern )
            )
            -- assumption: this originally read "seq = image", which is an undefined variable
            else seq = djv.input

            if seq.count > 0 then
            (
                djv.isExr = getFilenameType (amin seq) == ".exr"
                head = getfilenameFile (amin seq)
                tail = getfilenameFile (amax seq)
                -- you can reuse the same regex pattern, this time extracting the padding
                h1 = (rxPadding.match head).groups.item[0].value
                t1 = (rxPadding.match tail).groups.item[0].value

                if h1 as integer != undefined and t1 as integer != undefined then
                    rxPadding.replace djv.input (h1 + "-" + t1) 1
                else djv.input
            )
        )
    ),

    -- default resolution parameter is half res
    fn createQuickTimeFromEXR res:djv.resolution =
    (
        -- Add the path to the DJV executable
        -- Transcoding to quicktime would need the x86 version,
        -- Image Transcoding can be used with either the x64 or x86 versions
        -- These paths need to reflect where you installed DJV to
        if doesfileexist djv.x86Path then
        (
            -- pass the first frame in the sequence to the regex function
            -- this is a 100 frame sequence
            theFile = parseImageSequenceToInputString()
            --theFile = "C:\Users\LoneRobot\Documents\3dsMax\renderoutput\sq_PassTest\T_0000-0100.exr"
            -- insert theFile into the CMD argument stream
            if theFile != undefined then
            (
                --setup the file output
                filePath = getfilenamepath theFile
                fileName = getfilenamefile theFile
                outDir = pathconfig.appendpath filePath "_preview"
                if not doesfileexist outDir then makedir outDir
                theOutPut = pathconfig.appendpath outDir (fileName + ".mov")
                -- we are going to create a half res quicktime
                args = djv.cmdArgs thefile theOutput res
                --theFile + " " + theOutPut + @" -load OpenEXR Gamma 1.0 -scale " + (res as string)
                -- create the commandline process
                proc = dotnetobject "system.diagnostics.process"
                -- make sure this is the executable you want to use
                -- (fixed: this was the undefined variable djvX86Path)
                proc.StartInfo.FileName = djv.x86Path
                proc.StartInfo.Arguments = args
                proc.StartInfo.RedirectStandardOutput = true
                -- needs UseShellExecute set to false in order to redirect IO streams
                proc.StartInfo.UseShellExecute = false
                proc.StartInfo.CreateNoWindow = true
                format "**************************\nCreating Quicktime from EXR sequence\n"
                proc.Start()
                -- make sure we don't freeze
                windows.processPostedMessages()
                reader = proc.StandardOutput
                while (l = reader.readline()) != undefined do
                (
                    format "%\n" l
                    windows.processPostedMessages()
                )
                if djv.openOutput then shelllaunch "explorer.exe" ("/e," + outDir)
            )
            else messagebox ("There was a problem with the image you supplied.\n\n" + djv.input + "\nFile Exists? : " + (if doesfileexist djv.input then "Yes" else "No")) title:"ooops" beep:false
        )
        else messagebox ("You don't appear to have DJV installed in the location specified.\n\n" + djv.x86Path) title:"ooops" beep:false
    )
)-- end struct

-- example usage
djv = djvOps()
-- set the first frame of the exr sequence to the struct input
djv.input=@"C:\Users\LoneRobot\Documents\3dsMax\renderoutput\sq_PassTest\T_0000.exr"
djv.createQuickTimeFromEXR()
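
If the struct is a lot to take in, the regex part of it boils down to three calls on one pattern. Here they are in isolation as a rough sketch, using short made-up filenames instead of full paths:

```maxscript
rxPadding = dotnetobject "System.Text.RegularExpressions.Regex" "[0-9]{4,5}"

-- swap the first run of 4 or 5 digits for a wildcard, ready for getfiles
rxPadding.replace "T_0000.exr" "*" 1
-- "T_*.exr"

-- pull the frame number back out of a filename
(rxPadding.match "T_0100").value
-- "0100"

-- swap the padding for <FirstFrame><Hyphen><LastFrame>, which is the form DJV expects
rxPadding.replace "T_0000.exr" ("0000" + "-" + "0100") 1
-- "T_0000-0100.exr"
```

The final argument of 1 limits the replace to the first match only. Bear in mind that the first run of 4 or 5 digits in a full path might be in a folder name rather than the frame number, so keep digits like that out of your render folders or run the regex on the filename alone.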

So that concludes things for now. Please check back soon for new posts. Yes I know I’ve been rubbish. Thanks to Rotem Schiffman for his maxscript syntax brush, it makes these posts much more readable.
