> Schema > Products > SRTransport > Documentation > Using DataScript
|
Using DataScript
DataScript is used to manipulate data in the expressions part of a transport and in pre and post processing. (See Transforming Data). It is not intended to be a complete programming language. For quick reference see DataScript Reference Value ExpressionsIn the transport window the Source and Target Columns actually represent an object called a tuple. A tuple is like a dictionary or a hash that holds key, value pairs. Each key (variable name) is associated with a variable in the tuple. Expressions often operate on the variables in the source tuple and assign the results to variables in the destination tuple. A 'Value Expression' field is associated with the 'Target Column' field in the same row. The simplest and most common operation is a direct assignment from source to destination. Put the name of the source variable in the expressions row corresponding to the destination variable. The values assigned to the Destination tuple are ultimately sent to the database or output file. In the expressions column you assign values to the destination variables. They can be values from a source column with or without modifications or they can be derived in some other way.
The most important functions are available by selecting 'Functions' from the Transport menu. To use one, select a target column name and select the desired function from the menu. The function will appear in the 'Value Expression' for the selected column. Fill in the required values in the parenthesis. (See the function explanations under 'Pre and Post Processing' for more information on what parameters and return types functions take.) Pre and Post ProcessingIn the center of the main 'Transport' window is a button marked with an ellipsis. If you click on it the Transport Customization window will be displayed. This window has a pop-up menu that contains a number of places to change the way SRTransport handles the rows as they move through from source to destination. If you are unfamiliar with the sequence of events read SRTransport Processing Architecture. What concerns us here are the four pre and post processing areas. In pre and post processing, the language is DataScript. You can set a variable to a constant value, to some other variable value or to the result of a function call that returns a value. (Procedures that do not return a value are used with the keyword call). You can try these examples in the Interactive DataScript window ( select 'Interactive DataScript' from the Transport menu). Working With VariablesThere are 4 types of variables that are important for processing - integer, double, string and text. There are several other types that SRTransport uses that are not needed for processing but you might see mentioned. They include 'tuple', and 'list'. As explained above a tuple is basically a dictionary, and a 'list' is basically an array. All variables are global and affect SRTransport's processing space.
The use of declare, set, print and remove when working with variables.
declare i integer set i = 100 print i remove i Variables may be initialized with a constant when they are declared.
declare j integer = 100, d double set d = 99.99 print = j + d remove j, d If you need a variable for processing rows, declare it in Run preprocessing and remove it in Run postprocessing. Manipulating StringsStrings, like all other variables must be declared. You can assign a constant value to a string in the declaration, or use set. Constant strings must appear in quotes.They may be single or double quoted. If a string needs to contain one use the other to quote it with. declare aString string = "some text" print aString set aString = 'some "other" text' declare anotherString string set anotherString = aString If you must have both kinds of quotes in a string, double quote the string and escape double quotes with another double quote. set aString = """A very 'confused' quote"", he said." Strings can be concatenated using the '+' operator. set aString = aString + " with yet more text added" print aString There are a number of utility functions for working with strings. To remove white space (spaces, tabs, CR's) from either or both ends of a string use trim(string) (both sides), ltrim(string) (left side) or rtrim(string) (right side).
set aString = " empty spaces on either end "
set aString = trim(aString)
or
set aString = ltrim(aString)
or
set aString = rtrim(aString)
To remove characters from a string use strip(string, 'characters'). set aString = '"text that includes quotes and $dollar signs"' set aString = strip(aString, '"$') You can set strings to all upper or lower case with upper(string) and lower(string). set aString = 'lowercase text' set aString = upper(aString) set aString = lower(aString) You can find out how long a string is with length(string).
set aString = "This long string contains a 'short string'" declare shortString string = "short string" set i = position(shortString, aString) declare anotherString string = substr(aString, i, length(shortString)) print anotherString Convert an expression to a string with convert(type, expression). set aString = convert(string, i*12) Substitutions can be made with three different substitute functions. substitute1(key-string, substitute-column, key-column, table)
substitute2 (key-string, default-string {,value, substitute}+)
substitute3 (key-string, filename)
That's it for strings - Don't forget to clean up. remove aString, shortString, anotherString, i, Working with the 'text' type'text' differs from 'string' in that it can contain binary data. It can also contain ascii characters (and therefore be printable). 'text' can also be longer( megabytes - gigabytes ) whereas a 'string' is limited to 1000 characters. Some functions return 'text' types. Blobs are 'text'. To assign data to a 'text' variable use the print to keywords. declare someText text print to someText "some very interesting text" print someText remove someText Working With FilesDuring processing you can read from and write to files.
declare F text redirect F to file "/tmp/photo.data" print to F photo reset F remove F Going in the opposite direction - Suppose you had a file, 'paths.txt', containing the file path names for blobs you wanted to insert into a table. Use 'paths.txt' as a source and configure one field called 'path' with a type 'string'. In 'Run preprocessing' declare a text variable - declare bigdata text In 'Row preprocessing' set bigdata's value to the contents of the file in the variable 'path'- set bigdata = filecontents(path) then in 'Value Expression' you can use 'bigdata' as the value of the destination blob. Important Functionsexecsql(sqlString) is a powerful tool in pre- and post- processing.
In 'Run preprocessing' declare some variables - declare sql string, c_id integer If in 'Row preprocessing' one wrote something like this -
(one line)
set sql = 'SELECT customer_id FROM aTable
WHERE name = "' + lname + ', ' + fname + '"'
set c_id = execsql(sql)
You can now use c_id in the Value Expressions column In 'Run postprocessing' clean up. remove sql, c_id sys(cmdString) (not available on "Windows") The command is executed (by /bin/sh) and the stdout is returned as a 'text' type (can be converted to a string). This allows you to use many unix functions for processing source data. You can execute shell scripts. For example. Suppose you have a shell script called /usr/home/myscript.sh that takes a string, myString, as an argument and returns some output that you want to assign to a string variable. declare result string, cmd string set cmd = "sh /usr/home/myscript " + myString set result = convert( string, sys(cmd) ) Other Functionsisnull(a,b) If a is null, substitute b.
Control StatementsDataScript has a number of control structures that are similar to C. if - else if - else
Notice that parenthesis are not required around the condition, that brackets are only required when there is more than one statement as part of the block and that you use '=' for equality (not '==' as you would in C, C++ or Java). Break causes an exit from the while loop. Continue causes the next iteration of the loop to begin, bypassing any subsequent statements in the loop.
while [some condition]
{
if 1 < 2
print "one is less than two"
else if 1 = 2
{
print "there is something wrong with this math"
continue # goes to the top of the loop
}
else # 1 > 2
{
print "No way!"
break # exits the loop
}
print "end of loop" # prints only when 1 < 2
}
DataScript does not have a for loop but you can use a while loop to accomplish the same thing.
declare j integer = 100
declare i integer = 0
while i < j {
// do stuff
set i = i+1
}
remove i, j
Writing Your Own FunctionsFunction declarations use the keyword 'procedure'. To return a value, use the return statement. If you don't explicitly return a value, then NULL is returned implicitly. Here is a trivial example that shows the basic syntax.
procedure MyAdder ( a integer, b integer )
{
declare answer integer
set answer = a + b
return answer
}
declare i integer
set i = MyAdder( 1, 2 )
print i
See Also |