ICON语言是一种高级编程语言,具有强有力的字符串及结构分析和处理能力,适合于解决
人文科学领域中的问题,在欧美许多大学的文科院系,ICON语言作为必修课程。
ICON语言简介
利用ICON程序语言编写软件进行语言学研究
将文件转换为
文本形式内码、笔画、读音、部首等属性信息进行排序,等等。要使计算机完成上述种种任务,离不开各种
应用软件。但是,目前市面上能满足一般语言学研究者进行语言研究需要的软件还不多见,这就要求研究者自己掌握一、两种编程语言,根据自己的需要编写
应用软件。
由
美国亚利桑那大学计算机系开发的ICON语言是一种以字符串和结构处理为特色的程序语言,现在已经有了可以在WINDOWS上运行的版本,而且
清华大学出版社
Introduction
Icon is a high-level programming language with extensive facilities for processing strings and structures. Icon has several novel features, including expressions that may produce sequences of results, goal-directed evaluation that automatically searches for a successful result, and string scanning that allows operations on strings to be formulated at a high conceptual level. Icon also provides high-level graphics facilities.
Icon emphasizes high-level string processing and a design philosophy that allows ease of programming and short, concise programs. Storage allocation and garbage collection are automatic in Icon, and there are few restrictions on the sizes of objects. Strings, lists, and other structures are created during program execution and their size does not need to be known when a program is written. Values are converted to expected types automatically; for example, numeral strings read in as input can be used in numerical computations without explicit conversion. Icon has an expression-based syntax with reserved words; in appearance, Icon programs resemble those of Pascal and C.
Although Icon has extensive facilities for processing strings and structures, it also has a full repertoire of computational facilities. It is suitable for a wide variety of applications. Some examples are:
text analysis
text editing
document formatting
artificial intelligence
expert systems
rapid prototyping
symbolic mathematics
text generation
data laundry
There are public-domain implementations of Icon for the Macintosh, MS-DOS, many UNIX systems, and VAX/VMS. The remainder of this report briefly describes the highlights of Icon. For a complete description, see References 1 and 2.
ExpressionEvaluation
2.1 Conditional Expressions
In Icon there are conditional expressions that may succeed and produce a result, or may fail and not produce any result. An example is the comparison operation
i > j
which succeeds (and produces the value of j) provided that the value of i is greater than the value of j, but fails otherwise. Similarly,
i > j > k
succeeds if the value of j is between i and k.
The success or failure of conditional operations is used instead of Boolean values to drive control structures in Icon. An example is
if i > j then k := i else k := j
which assigns the value of i to k if the value of i is greater than the value of j, but assigns the value of j to k otherwise.
The usefulness of the concepts of success and failure is illustrated by find(s1,s2), which fails if s1 does not occur as a substring of s2. Thus
Many expressions in Icon are conditional. An example is read(), which produces the next line from the input file, but fails when the end of the file is reached. The following expression is typical of programming in Icon and illustrates the integration of conditional expressions and conventional control structures:
while line := read() do
write(line)
This expression copies the input file to the output file.
while write(read())
2.2 Generators
In some situations, an expression may be capable of producing more than one result. Consider
The results that a generator produces depend on context. In a situation where only one result is needed, the first is produced, as in
which assigns the value 3 to i.
If the result produced by a generator does not lead to the success of an enclosing expression, however, the generator is resumed to produce another value. An example is
Here the first result produced by the generator, 3, is assigned to i, but this value is not greater than 5 and the comparison operation fails. At this point, the generator is resumed and produces the second position, 23, which is greater than 5. The comparison operation then succeeds and the value 23 is written. Because of the inheritance of failure and the fact that comparison operations return the value of their right argument, this expression can be written in the following more compact form:
Goal-directed evaluation is inherent in the expression evaluation mechanism of Icon and can be used in arbitrarily complicated situations. For example,
A generator can be resumed repeatedly to produce all its results by using the every-do control structure. An example is
do write(i)
Generation is inherited like failure, and this expression can be written more concisely by omitting the optional do clause:
There are several built-in generators in Icon. One of the most frequently used of these is
i to j
which generates the integers from i to j. This generator can be combined with every-do to formulate the traditional for-style control structure:
every k := i to j do
f(k)
Note that this expression can be written more compactly as
every f(i to j)
There are several other control structures related to generation. One is alternation,
expr1 | expr2
which generates the results of expr1 followed by the results of expr2. Thus
Another use of alternation is illustrated by
(i | j | k) = (0 | 1)
which succeeds if any of i, j, or k has the value 0 or 1.
Procedures can be used to add generators to Icon's built-in repertoire. For example,
procedure findodd(s1, s2)
every i := find(s1, s2) do
if i % 2 = 1 then suspend i
end
is a procedure that generates the odd-valued positions at which s1 occurs in s2. The suspend control structure returns a value from the procedure, but leaves it in suspension so that it can be resumed for another value. When the loop terminates, control flows off the end of the procedure without producing another value.
String Scanning
For complicated operations, the bookkeeping involved in keeping track of positions in strings becomes burdensome and error prone. Icon has a string scanning facility that is manages positions automatically. Attention is focused on a current position in a string as it is examined by a sequence of operations.
The string scanning operation has the form
s ? expr
where s is the subject string to be examined and expr is an expression that performs the examination. A position in the subject, which starts at 1, is the focus of examination.
Matching functions change this position. One matching function, move(i), moves the position by i and produces the substring of the subject between the previous and new positions. If the position cannot be moved by the specified amount (because the subject is not long enough), move(i) fails. A simple example is
line ? while write(move(2))
which writes successive two-character substrings of line, stopping when there are no more characters.
Another matching function is tab(i), which sets the position in the subject to i and also returns the substring of the subject between the previous and new positions. For example,
line ? if tab(10) then write(tab(0))
first sets the position in the subject to 10 and then to the end of the subject, writing the remaining characters. Note that no value is written if the subject is not long enough.
String analysis functions such as find() can be used in string scanning. In this context, the string that they operate on is not specified and is taken to be the subject. For example,
do move(2)
Another example of the use of string analysis functions in scanning is
line ? while tab(upto(&letters)) do
write(tab(many(&letters)))
which writes all the words in line.
As illustrated in the examples above, any expression may occur in the scanning expression.
Structures
Icon supports several kinds of structures with different organizations and access methods. Lists are linear structures that can be accessed both by position and by stack and queue functions. Sets are collections of arbitrary values with no implied ordering. Tables provide an associative lookup mechanism.
4.1 Lists
While strings are sequences of characters, lists in Icon are sequences of values of arbitrary types. Lists are created by enclosing the lists of values in brackets. An example is
in which the list car1 has four values, two of which are strings and two of which are integers. Note that the values in a list need not all be of the same type. In fact, any kind of value can occur in a list -- even another list, as in
inventory := [car1, car2, car3, car4]
Lists also can be created by
L := list(i, x)
which creates a list of i values, each of which has the value x.
The values in a list can be referenced by position. Thus,
car1[4] := 2400
changes the last value in car1 to 2400. A reference that is out of the range of the list fails. For example,
write(car1[5])
fails.
The values in a list L are generated by !L. Thus
every write(!L)
writes all the values in L.
Lists can be manipulated like stacks and queues. The function push(L, x) adds the value of x to the left end of the list L, automatically increasing the size of L by one. Similarly, pop(L) removes the leftmost value from L, automatically decreasing the size of L by one, and produces the removed value.
4.2 Sets
A set is a collection of values. An empty set is created by set(). Alternatively, set(L) produces a set with the values in the list L. For example,
The set operations of union, intersection, and difference are provided. The function member(S, x) succeeds if x is a member of the set S but fails otherwise. The function insert(S, x) adds x to the set S, while delete(S, x) removes x from S. A value only can occur once in a set, so insert(S, x) has no effect if x is already in S. !S generates the members of S.
A simple example of the use of sets is given by the following segment of code, which lists all the different words that appear in the input file:
words := set()
while line := read() do
line ? while tab(upto(&letters)) do
insert(words, tab(many(&letters)))
every write(!words)
4.3 Tables
Tables are sets of pairs each of which consists of a key and a corresponding value. The key and its corresponding value may be of any type, and the value for any key can be looked up automatically. Thus, tables provide a form of associative access in contrast with the positional access to values in lists.
A table is created by an expression such as
symbols := table(0)
which assigns to symbols a table with the default value 0. The default value is used for keys that are not assigned another value. Subsequently, symbols can be referenced by any key, such as
Tables grow automatically as new keys are added. For example, the following program segment produces a table containing a count of the words that appear in the input file:
words := table(0)
while line := read() do
line ? while tab(upto(&letters)) do
words[tab(many(&letters))] +:= 1
Here the default value for each word is 0, as given in table(0), and +:= is an augmented assignment operation used to increment the values. There are augmented assignment operations for all binary operators.
A list can be obtained from a table T by the function sort(T, 1). The form of the list depends on the value of i. For example, if i is 3, the list contains alternate keys and their corresponding values in T. For example,
wordlist := sort(words, 3)
writes the words and their counts from the table words.
Graphics Facilities
Icon supports high-level graphics facilities:
Windows can be opened and closed as desired.
Text can be written to windows in a variety of type faces and sizes, including proportional-width faces.
Characters from the keyboards can be processed as they are typed.
Points, lines, polygons, circles, ares, and smooth curves can be freely mixed with text.
Colors can be used for both text and graphics.
Image files can be read and written.
For more information, see Reference 3.
An Example
The following program, which produces a concordance of the words from an input file, illustrates typical Icon programming techniques. Although not all the features in this program are described in previous sections, the general idea should be clear.
global uses, lineno, width
procedure main(args)
width := 15 # width of word field
uses := table()
lineno := 0
every tabulate(words()) # tabulate all citations
output() # print the citations
end
# Add line number to citations for word
#
procedure tabulate(word)
/uses[word] := set()
insert(uses[word], lineno)
return
end
# Generate words
#
procedure words()
while line := read() do {
lineno +:= 1
map(line) ? while tab(upto(&letters)) do {
s := tab(many(&letters))
if *s >= 3 then suspend s# skip short words
}
}
end
# Print the results
#
procedure output()
write() # blank line
uses := sort(uses, 3) # sort citations
while word := get(uses) do {
numbers := sort(get(uses))
write(left(word, width), line[1:-2])
}
end
After all the input has been read, the table of words is sorted by key. Each corresponding set of line numbers is sorted and the line numbers are appended to the line to be written.
For example, if the input file is
On the Future!-how it tells
Of the rapture that impells
To the swinging and the ringing
Of the bells, bells, bells-
Of the bells, bells, bells, bells,
Bells, bells, bells-
To the rhyming and the chiming of the bells!
the output is
1 On the Future!-how it tells
2 Of the rapture that impells
3 To the swinging and the ringing
4 Of the bells, bells, bells-
5 Of the bells, bells, bells, bells,
6 Bells, bells, bells-
7 To the rhyming and the chiming of the bells!
and 3, 7
bells 4, 5, 6, 7
chiming 7
future 1
how 1
impells 2
rapture 2
rhyming 7
ringing 3
swinging 3
tells 1
that 2
the 1, 2, 3, 4, 5, 7
Acknowledgement
Icon was designed by the the author in collaboration with Dave Hanson, Tim Korb, Cary Coutant, and Steve Wampler. Many other persons have contributed to its development. The current implementation is based on the work of Cary Coutant and Steve Wampler with recent contributions by Bill Mitchell, Janalee O'Bagy, Gregg Townsend, Ken Walker, and Clint Jeffery.
References
1. R. E. Griswold and M. T. Griswold, The Icon Programming Language, Prentice-Hall, Inc., Englewood Cliffs, NJ, second edition, 1990.
2. R. E. Griswold, C. L. Jeffery and G. M. Townsend, Version 9.3 of the Icon Programming Language, The Univ. of Arizona Icon Project Document IPD278, 1995.
3. G. M. Townsend, R. E. Griswold and C. L. Jeffery, Graphics Facilities for the Icon Programming Language; Version 9.3, The Univ. of Arizona Icon Project Document IPD281, 1995.