PowerShell is definitely gaining momentum in the Windows scripting world, but I still hear from folks who want to rely on Unix-based tools to get their job done.  In this series of posts I’m going to look at converting some of the more popular Unix-based tools to PowerShell.

cut

The Unix “cut” command is used to extract sections from each line of input.  Line segments can be extracted by bytes, characters, or fields separated by a delimiter.  In each case a range must be provided, consisting of one of N, N-M, N- (N to the end of the line), or -M (beginning of the line to M), where N and M are counted from 1 (there is no zeroth value).

For PowerShell, I’ve omitted support for bytes, but the rest of the features are included.  The Parse-Range function parses the range specification described above.  It takes a range specifier as input and returns an array of the indices that the range contains.  The In-Range function is then used to determine whether a given index is included in the parsed range.
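To make the range handling concrete, here is a minimal sketch of how such parsing could look.  It is not the Parse-Range and In-Range code from this post: the names ConvertTo-CutRange and Test-InCutRange are hypothetical, and instead of returning an array of indices the sketch returns first/last pairs so that an open-ended range like N- does not need to know the line length up front.  It also accepts a comma-separated list of ranges, as Unix cut does.

```powershell
# Hypothetical helper (not the post's Parse-Range): turn a cut-style
# list such as "1,3-5,7-" into objects with First and Last positions.
function ConvertTo-CutRange {
    param([Parameter(Mandatory)][string]$Spec)

    foreach ($part in $Spec.Split(',')) {
        $part = $part.Trim()
        if ($part -match '^(\d+)$') {
            # N: a single position
            $first = [int]$Matches[1]; $last = $first
        }
        elseif ($part -match '^(\d+)-(\d+)$') {
            # N-M: a closed range
            $first = [int]$Matches[1]; $last = [int]$Matches[2]
        }
        elseif ($part -match '^(\d+)-$') {
            # N-: from N to the end of the line
            $first = [int]$Matches[1]; $last = [int]::MaxValue
        }
        elseif ($part -match '^-(\d+)$') {
            # -M: from the beginning of the line to M
            $first = 1; $last = [int]$Matches[1]
        }
        else {
            throw "Invalid range '$part'"
        }
        # Positions count from 1 and a range may not run backwards.
        if ($first -lt 1 -or $last -lt $first) {
            throw "Invalid range '$part'"
        }
        [pscustomobject]@{ First = $first; Last = $last }
    }
}

# Hypothetical helper (not the post's In-Range): is a 1-based index
# covered by any of the parsed ranges?
function Test-InCutRange {
    param([int]$Index, [object[]]$Ranges)
    foreach ($r in $Ranges) {
        if ($Index -ge $r.First -and $Index -le $r.Last) { return $true }
    }
    return $false
}

$ranges = ConvertTo-CutRange '1,3-5,7-'
Test-InCutRange -Index 4 -Ranges $ranges   # True
Test-InCutRange -Index 6 -Ranges $ranges   # False
```

Storing bounds rather than every index is a design choice made for this sketch only; an array of indices works just as well once the length of the line or the number of fields is known.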

The real work is done in the Do-Cut function.  It first checks for input error conditions.  Then, for each file supplied, lines are extracted and processed with the given input specifiers.  For character ranges, each character is examined and, if its index in the line is in the given range, it is appended to the output line.  For field ranges, the line is split into tokens using the delimiter specifier (the default is a TAB).  Each field is examined and, if its index is in the included range, the field is appended to the output using the given output_delimiter specifier (which defaults to the input delimiter).
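The per-line logic described above can be sketched roughly as follows.  This is an illustration under stated assumptions rather than the Do-Cut function itself: Select-CutCharacter and Select-CutField are hypothetical names, their parameter names are made up for the sketch, and the wanted positions are passed in as a plain array of 1-based indices.  Lines without a delimiter are passed through unchanged unless suppressed, which is how Unix cut treats them.

```powershell
# Hypothetical sketch: keep the characters whose 1-based position is in $Indices.
function Select-CutCharacter {
    param([string]$Line, [int[]]$Indices)
    $out = New-Object System.Text.StringBuilder
    for ($i = 0; $i -lt $Line.Length; $i++) {
        if ($Indices -contains ($i + 1)) { [void]$out.Append($Line[$i]) }
    }
    $out.ToString()
}

# Hypothetical sketch: keep the fields whose 1-based position is in $Indices.
function Select-CutField {
    param(
        [string]$Line,
        [int[]]$Indices,
        [string]$Delimiter = "`t",
        [string]$OutputDelimiter = $Delimiter,
        [switch]$OnlyDelimited
    )
    if (-not $Line.Contains($Delimiter)) {
        # No delimiter on this line: emit it as-is unless -OnlyDelimited is set.
        if (-not $OnlyDelimited) { $Line }
        return
    }
    # Passing a string[] separator makes Split treat the delimiter literally, not as a regex.
    $fields = $Line.Split([string[]]@($Delimiter), [System.StringSplitOptions]::None)
    $kept = for ($i = 0; $i -lt $fields.Length; $i++) {
        if ($Indices -contains ($i + 1)) { $fields[$i] }
    }
    $kept -join $OutputDelimiter
}

Select-CutCharacter -Line 'PowerShell' -Indices 1,2,3,4,5                         # Power
Select-CutField -Line 'a:b:c:d' -Indices 2,4 -Delimiter ':'                       # b:d
Select-CutField -Line 'a:b:c:d' -Indices 2,4 -Delimiter ':' -OutputDelimiter ','  # b,d
```

Fields are always emitted in input order, regardless of the order in which the indices are listed, which matches the behavior of Unix cut.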

The options to the Unix cut command are implemented with the following PowerShell arguments:

Unix                 PowerShell          Description
FILE                 -filespec           The files to process.
-c                   -characters         Output only this range of characters.
-f                   -fields             Output only the fields specified by the given range.
-d                   -delimiter          Use DELIM instead of TAB as the input field delimiter.
-s                   -only_delimited     Do not print lines that contain no delimiter.
--output-delimiter   -output_delimiter   Use STRING as the output delimiter.