textutil - Procedures to manipulate texts and strings.
package require Tcl 8.2
package require textutil ?0.7.1?
::textutil::adjust string args
::textutil::adjust::readPatterns filename
::textutil::adjust::listPredefined
::textutil::adjust::getPredefined filename
::textutil::indent string prefix ?skip?
::textutil::undent string
::textutil::splitn string ?len?
::textutil::splitx string ?regexp?
::textutil::tabify string ?num?
::textutil::tabify2 string ?num?
::textutil::trim string ?regexp?
::textutil::trimleft string ?regexp?
::textutil::trimright string ?regexp?
::textutil::trimPrefix string prefix
::textutil::trimEmptyHeading string
::textutil::untabify string ?num?
::textutil::untabify2 string ?num?
::textutil::strRepeat text num
::textutil::blank num
::textutil::chop string
::textutil::tail string
::textutil::cap string
::textutil::uncap string
::textutil::longestCommonPrefixList list
::textutil::longestCommonPrefix ?string...?
The package textutil provides commands that manipulate
strings or texts (a.k.a. long strings, or strings with embedded newlines or
paragraphs). It is actually a bundle providing the commands of the six
packages
- textutil::adjust
- textutil::repeat
- textutil::split
- textutil::string
- textutil::tabify
- textutil::trim
in the namespace textutil.
The bundle is deprecated, and it will be removed in a
future release of Tcllib, after the next release. It is recommended to use
the relevant sub packages instead for whatever functionality is needed by
the using package or application.
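As a minimal sketch of that recommendation, the snippet below loads two of the sub packages directly instead of the bundle. It assumes that each sub package exposes its commands one namespace level deeper than the bundle does (for example ::textutil::split::splitn instead of ::textutil::splitn); check the documentation of the individual sub packages before relying on this.

```tcl
# Load only the sub packages that are actually needed.
package require textutil::split
package require textutil::trim

# Bundle form: ::textutil::splitn abcdefg 3
set chunks [::textutil::split::splitn abcdefg 3]

# Bundle form: ::textutil::trim "  padded  "
set trimmed [::textutil::trim::trim "  padded  "]

puts $chunks
puts "<$trimmed>"
```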
The complete set of procedures is described below.
- ::textutil::adjust string args
- Justifies the string according to args. The
string is taken as one big paragraph, ignoring any newlines. The paragraph
is then formatted according to the options used, and the command returns a new
string with enough lines to contain all the printable chars in the input
string. A line is a set of chars between the beginning of the string and a
newline, or between 2 newlines, or between a newline and the end of the
string. If the input string is small enough, the returned string won't
contain any newlines.
Together with ::textutil::indent it is possible to
create properly wrapped paragraphs with arbitrary indentations.
By default, any sequence of space or tab characters
is replaced by a single space, so each word in a line is separated from
the next one by exactly one space char, and this forms a real
line. Each real line is placed in a logical line, which
has exactly a given length (see -length option below). The
real line may be shorter. Again by default, any
trailing spaces are removed before returning the string (see
-full option below). The following options may be used after the
string parameter, and change the way the command places a
real line in a logical line.
- -full boolean
- If set to false, any trailing space chars are deleted before
returning the string. If set to true, any trailing space chars are
left in the string. Defaults to false.
- -hyphenate boolean
- If set to false, no hyphenation will be done. If set to
true, the command tries to hyphenate the last word of a line. Defaults
to false. Note: hyphenation patterns must be loaded beforehand, using
the command ::textutil::adjust::readPatterns.
- -justify center|left|plain|right
- Set the justification of the returned string to center,
left, plain or right. By default, it is set to
left. The justification means that every line in the returned string
but the last one is built according to the value. If the justification is
set to plain and the number of printable chars in the last line is
less than 90% of the length of a line (see -length), then this line
is justified with the left value, avoiding the expansion of this
line when it is too short. The meaning of each value is:
- center
- The real line is centered in the logical line. If needed, space
characters are added at the beginning (half of the needed set) and at the
end (the other half) of the line (see the option
-full).
- left
- The real line is set on the left of the logical line. It means that there
are no space chars at the beginning of this line. If required, all needed
space chars are added at the end of the line (see the option
-full).
- plain
- The real line is exactly set in the logical line. It means that there are
no leading or trailing space chars. All the needed space chars are added
in the real line, between 2 (or more) words.
- right
- The real line is set on the right of the logical line. It means that there
are no space chars at the end of this line, and there may be some space
chars at the beginning, regardless of the -full option.
- -length integer
- Set the length of the logical line in the string to integer.
integer must be a positive integer value. Defaults to 72.
- -strictlength boolean
- If set to false, a line can exceed the specified -length if
a single word is longer than -length. If set to true, words
that are longer than -length are split so that no line exceeds the
specified -length. Defaults to false.
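The options above can be tried out with a short script. This is only a sketch: the sample text and the length of 20 are arbitrary choices, and the exact line breaks depend on the package version in use, so no output is shown.

```tcl
package require textutil

set text "The quick brown fox
jumps over   the lazy dog and keeps on running."

# Wrap into logical lines of 20 characters, left justified (the default).
set wrapped [::textutil::adjust $text -length 20]
puts $wrapped
puts ----

# Same text, right justified.
puts [::textutil::adjust $text -length 20 -justify right]
puts ----

# Same text, padded between words so lines fill the logical line.
puts [::textutil::adjust $text -length 20 -justify plain]
```

Because the input is treated as one paragraph, the newline and the run of spaces in the sample text do not survive into the output.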
- ::textutil::adjust::readPatterns filename
- Loads the internal storage for hyphenation patterns with the contents of
the file filename. This has to be done prior to calling command
::textutil::adjust with "-hyphenate true",
or the hyphenation process will not work correctly.
The package comes with a number of predefined pattern files,
and the command ::textutil::adjust::listPredefined can be used to
find out their names.
- ::textutil::adjust::listPredefined
- This command returns a list containing the names of the hyphenation files
coming with this package.
- ::textutil::adjust::getPredefined filename
- Use this command to query the package for the full path name of the
hyphenation file filename coming with the package. Only the
filenames found in the list returned by
::textutil::adjust::listPredefined are legal arguments for this
command.
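A possible way to combine the three pattern commands with hyphenation is sketched below. Picking the first file from the list is an arbitrary choice made for illustration; in real use the file should match the language of the text. The sample sentence and the length of 16 are likewise assumptions.

```tcl
package require textutil

# Names of the pattern files shipped with the package.
set names [::textutil::adjust::listPredefined]
puts "available pattern files: $names"

if {[llength $names] > 0} {
    # Resolve the short name to a full path, then load the patterns.
    set path [::textutil::adjust::getPredefined [lindex $names 0]]
    ::textutil::adjust::readPatterns $path

    set text "Hyphenation patterns allow considerably tighter paragraphs."
    puts [::textutil::adjust $text -length 16 -hyphenate true]
}
```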
- ::textutil::indent string prefix ?skip?
- Each line in the string is indented by adding the string prefix
at its beginning. The modified string is returned as the result of the
command.
If skip is specified the first skip lines are
left untouched. The default for skip is 0, causing the
modification of all lines. Negative values for skip are treated
like 0. In other words, skip > 0 creates a
hanging indentation.
Together with ::textutil::adjust it is possible to
create properly wrapped paragraphs with arbitrary indentations.
- ::textutil::undent string
- The command computes the common prefix for all lines in string
consisting solely of whitespace, removes this from each line and
returns the modified string.
Lines containing only whitespace are always reduced to
completely empty lines. They and empty lines are also ignored when
computing the prefix to remove.
Together with ::textutil::adjust it is possible to
create properly wrapped paragraphs with arbitrary indentations.
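The combination of undent, adjust and indent mentioned above might look as follows. This is a sketch under assumed values: the sample text, the length of 30 and the four-space prefix are all illustrative.

```tcl
package require textutil

set raw "
    This paragraph was written
    with source-level indentation
    that should not survive.
"

# 1. Strip the whitespace prefix shared by all non-empty lines.
set flat [::textutil::undent $raw]

# 2. Re-wrap the text into lines of at most 30 characters.
set wrapped [::textutil::adjust $flat -length 30]

# 3. Indent every line except the first (skip = 1): a hanging indent.
set result [::textutil::indent $wrapped "    " 1]
puts $result
```

Leaving out the skip argument in the last step indents the first line as well, which gives a plain block indent instead of a hanging one.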
- ::textutil::splitn string ?len?
- This command splits the given string into chunks of len
characters and returns a list containing these chunks. The argument
len defaults to 1 if none is specified. A negative length is
not allowed and will cause the command to throw an error. Providing an
empty string as input is allowed, the command will then return an empty
list. If the length of the string is not an integer multiple of the
chunk length, then the last chunk in the generated list will be shorter
than len.
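For illustration, a few calls covering the cases described above (the input strings are arbitrary):

```tcl
package require textutil

# Chunks of 3 characters; the last chunk is shorter.
set chunks [::textutil::splitn abcdefg 3]
puts $chunks

# len defaults to 1, so the string is split into single characters.
set chars [::textutil::splitn abc]
puts $chars

# An empty string yields an empty list.
set nothing [::textutil::splitn ""]
puts [llength $nothing]
```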
- ::textutil::splitx string ?regexp?
- Split the string and return a list. The string is split according
to the regular expression regexp instead of a simple list of chars.
Note that if the regexp contains capturing parentheses, the part of
each separator they match is added to the list as an additional element. If the
string is empty the result is the empty list, like for
split. If regexp is empty the string is split at
every character, like split does. The regular expression
regexp defaults to "[\\t \\r\\n]+".
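The following sketch exercises the default separator, a custom separator, and the effect of parentheses in the regexp. The input strings and patterns are made up for the example.

```tcl
package require textutil

# Default separator: runs of whitespace.
set words [::textutil::splitx "one  two\tthree"]
puts $words

# Custom separator: a comma followed by optional spaces.
set fields [::textutil::splitx "a, b,c" {, *}]
puts $fields

# A parenthesized group keeps the matched separator in the result.
set tokens [::textutil::splitx "1+2-3" {([+-])}]
puts $tokens
```

The last form is convenient for simple tokenizing, where the separators themselves carry meaning.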
- ::textutil::tabify string ?num?
- Tabify the string by replacing any substring of num space
chars by a tabulation and return the result as a new string. num
defaults to 8.
- ::textutil::tabify2 string ?num?
- Similar to ::textutil::tabify this command tabifies the
string and returns the result as a new string. A different
algorithm is used however. Instead of replacing any substring of
num spaces this command works more like an editor. num
defaults to 8.
Each line of the text in string is treated as if there
are tabstops every num columns. Only sequences of space
characters containing more than one space character and found
immediately before a tabstop are replaced with tabs.
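The difference between the two tabify commands can be seen in a small sketch like the one below. The sample strings are assumptions chosen so that the spaces end exactly at the first tabstop (column 8).

```tcl
package require textutil

# tabify: each group of num consecutive spaces becomes one tab.
set simple [::textutil::tabify "        x"]   ;# 8 spaces, then x

# tabify2: spaces are converted only where they end at a tabstop, so
# the text keeps its columns in an editor with tabstops every 8 columns.
set line "ab      c"                           ;# ab, 6 spaces, c
set editor [::textutil::tabify2 $line]

# untabify2 is the complement, so this should restore the original line.
set restored [::textutil::untabify2 $editor]
puts [expr {$restored eq $line}]
```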
- ::textutil::trim string ?regexp?
- Remove from string any leading and trailing substring matching
the regular expression regexp and return the result as a new
string. This applies to every line in the string, that is any substring
between 2 newline chars, or between the beginning of the string and a
newline, or between a newline and the end of the string, or, if the string
contains no newline, between the beginning and the end of the string. The
regular expression regexp defaults to "[ \\t]+".
- ::textutil::trimleft string ?regexp?
- Remove from string any leading substring matching the regular
expression regexp and return the result as a new string. This applies
to every line in the string, that is any substring between 2 newline
chars, or between the beginning of the string and a newline, or between a
newline and the end of the string, or, if the string contains no newline,
between the beginning and the end of the string. The regular expression
regexp defaults to "[ \\t]+".
- ::textutil::trimright string ?regexp?
- Remove from string any trailing substring matching the regular
expression regexp and return the result as a new string. This applies
to every line in the string, that is any substring between 2 newline
chars, or between the beginning of the string and a newline, or between a
newline and the end of the string, or, if the string contains no newline,
between the beginning and the end of the string. The regular expression
regexp defaults to "[ \\t]+".
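A short sketch of the three trim commands working line by line, including one call with a custom regular expression. The sample text and the comment-marker pattern are illustrative only.

```tcl
package require textutil

set text "  alpha  \n\tbeta\t\n gamma"

# Each line is trimmed separately; the newlines stay in place.
set both  [::textutil::trim      $text]
set left  [::textutil::trimleft  $text]
set right [::textutil::trimright $text]
puts $both

# Custom regexp: strip leading "#" comment markers from every line.
set code [::textutil::trimleft "# one\n## two" {#+ *}]
puts $code
```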
- ::textutil::trimPrefix string prefix
- Removes the prefix from the beginning of string and returns
the result. The string is left unchanged if it doesn't have
prefix at its beginning.
- ::textutil::trimEmptyHeading string
- Looks for empty lines (including lines consisting of only whitespace) at
the beginning of the string and removes them. The modified string is
returned as the result of the command.
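As an illustration of the two commands above, with made-up input strings:

```tcl
package require textutil

# The prefix is removed only when the string actually starts with it.
set stripped  [::textutil::trimPrefix "foobar" "foo"]
set untouched [::textutil::trimPrefix "foobar" "bar"]
puts "$stripped $untouched"

# Leading empty and whitespace-only lines are dropped.
set body [::textutil::trimEmptyHeading "\n   \nfirst line\nsecond line"]
puts $body
```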
- ::textutil::untabify string ?num?
- Untabify the string by replacing any tabulation char by a substring
of num space chars and return the result as a new string.
num defaults to 8.
- ::textutil::untabify2 string ?num?
- Untabify the string by replacing any tabulation char by a substring
of at most num space chars and return the result as a new string.
Unlike textutil::untabify each tab is not replaced by a fixed
number of space characters. The command overlays each line in the
string with tabstops every num columns instead and replaces
tabs with just enough space characters to reach the next tabstop. This is
the complement of the actions taken by ::textutil::tabify2.
num defaults to 8.
There is one asymmetry though: A tab can be replaced with a
single space, but not the other way around.
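The contrast between the two untabify commands shows up as soon as a tab does not start at a tabstop. In this sketch the sample line and num = 4 are arbitrary choices.

```tcl
package require textutil

set line "ab\tc"

# untabify: every tab becomes exactly num spaces, wherever it stands.
set fixed [::textutil::untabify $line 4]

# untabify2: every tab is padded up to the next tabstop (columns 4, 8, ...).
set aligned [::textutil::untabify2 $line 4]

puts "<$fixed>"
puts "<$aligned>"
```

Here the tab sits at column 2, so untabify inserts four spaces while untabify2 inserts only the two needed to reach column 4.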
- ::textutil::strRepeat text num
- The implementation depends on the core executing the package. Uses
string repeat if it is present, or a fast Tcl implementation if it
is not. Returns a string containing the text repeated num
times. The repetitions are joined without characters between them. A value
of num <= 0 causes the command to return an empty string.
- ::textutil::blank num
- A convenience command. Returns a string of num spaces.
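A brief sketch of both repetition commands; the variable names and sample values are illustrative.

```tcl
package require textutil

set rule [::textutil::strRepeat "-=" 5]    ;# five copies, nothing between
set pad  [::textutil::blank 4]             ;# four spaces
set none [::textutil::strRepeat "abc" 0]   ;# num <= 0 gives an empty string

puts $rule
puts "<$pad>"
```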
- ::textutil::chop string
- A convenience command. Removes the last character of string and
returns the shortened string.
- ::textutil::tail string
- A convenience command. Removes the first character of string and
returns the shortened string.
- ::textutil::cap string
- Capitalizes the first character of string and returns the modified
string.
- ::textutil::uncap string
- The complementary operation to ::textutil::cap. Forces the first
character of string to lower case and returns the modified
string.
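The four convenience commands above, applied to a sample word:

```tcl
package require textutil

set chopped [::textutil::chop  "hello"]   ;# last character removed
set rest    [::textutil::tail  "hello"]   ;# first character removed
set upper   [::textutil::cap   "hello"]   ;# first character upper case
set lower   [::textutil::uncap "Hello"]   ;# first character lower case

puts "$chopped $rest $upper $lower"
```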
- ::textutil::longestCommonPrefixList list
- ::textutil::longestCommonPrefix ?string...?
- Computes the longest common prefix for either the strings given to
the command, or the strings specified in the single list, and
returns it as the result of the command.
If no strings were specified the result is the empty string.
If only one string was specified, the string itself is returned, as it
is its own longest common prefix.
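Both calling conventions, plus the two edge cases described above, in a short sketch (the strings are arbitrary examples):

```tcl
package require textutil

# Strings passed as separate arguments.
set a [::textutil::longestCommonPrefix foobar foobaz food]

# Strings passed as a single list.
set b [::textutil::longestCommonPrefixList {/usr/lib /usr/local /usr/share}]

# Edge cases: no strings, and exactly one string.
set c [::textutil::longestCommonPrefix]
set d [::textutil::longestCommonPrefix alone]

puts [list $a $b $c $d]
```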
This document, and the package it describes, will undoubtedly
contain bugs and other problems. Please report such in the category
textutil of the Tcllib SF Trackers
[http://sourceforge.net/tracker/?group_id=12883]. Please also report any
ideas for enhancements you may have for either package and/or
documentation.
regexp(n), split(n), string(n)
TeX, formatting, hyphenation, indenting, paragraph, regular
expression, string, trimming