NET-CONFIG: Specify weight configuration for a layer's connections By default, units in one layer of a network are fully connected to units in a layer that feeds into it. Alternatively, for hidden and output layers, one can use a configuration file to specify what connections exist from the previous hidden layer (or any hidden layer for outputs) or from the input layer, as well as any sharing of weights between these connections. Similarly, biases for hidden layers and the outputs may be configured with sharing or sparsity. The configuration files to use are specified by "config:" options in the network specification. See net-spec.doc for details. The net-config-check command can be used to check for errors and display information regarding a configuration file, as described in net-config-check.doc. Configuration files can be written in various ways, as given by the detailed syntax at the end of this document. The simplest form, as a sequence of source-destination-weight triplets, is presented first below, followed by increasingly elaborate forms. Configuration files for biases have the same form as those for input-to-hidden, hidden-to-hidden, or hidden-to-output connections, except the source-destination-weight triplets are replaced by unit-bias doublets. This configuration file (which may be the output of a command, if specified with "%", see net-spec.doc) can contain one or more triplets of the form <source-unit> <destination-unit> <weight-index> These elements may be positive integers. The source unit ranges from 1 to the number of units in the source layer (either the input layer or a hidden layer). The destination unit ranges from 1 to the number of units in the hidden or output layer this configuration applies to. The weight-index ranges from 1 to the number of weights used, which may be less (or, unusally, more) than the number of connections, since weights may be shared. As an example, suppose the following network specification is used: net-spec log 2 3 cfg-i:conf 1 / ih=2 ho=3 and the file "conf" contains the following: 1 1 1 1 2 2 2 1 2 1 3 1 # The last one! Note that the line breaks are not significant, but help readability. A '#' causes the rest of the line to be ignored as a comment. This network will have two parameters for input-to-hidden weights, with the first parameter being used for the connections from the first input to the first and third hidden unit, and the second parameter being used for the connection from the first input to the second hidden unit, and from the second input to the first hidden unit. A connection from one unit to another may be associated with zero, one, or more than one weight, with multiple weights having an additive effect. If no weight is associated with a connection, it is effectivly absent from the network. Indexes for units or weights may be specified relative to the values in the previous triplet, using the form +offset or -offset. A "+" with no offset is the same as "+1", and a "-" with no offset is the same as "-1". An index specifed as "=" is the same as the previous index. The previous index is considered to be 0 at the start of the file. For example, the following configuration file has the same effect as the one above: + + 1 = + + + -1 = 1 +2 - A number may also be written in parts separated by "+" or "-". For example, the last item above could be written as follows: 7-6 +102-100 - or as follows: 1 +20+30-48 -10+9 A group of triplets may be repeated by enclosing it in items of the form "<rep>(" and ")", where <rep> is a positive integer (defaulting to 1 if absent). Such repeated groups may be nested. A group of triplets enclosed by "<rep>[" and "]" will update the indexes used for relative values, but without producing any connections. A group of triplets enclosed by "<rep>{" and "}" will act like one enclosed by "<rep>(" and ")" except that, after all repetitions, the indexes used for relative values will be restored to ther values before the repeat group. The utility of relative specifications and grouping is illustrated by the following network specification, which sets up a network that computes a convolution of an input sequence of length 10 with a 3-point linear filter, which is then used in a logistic regression model for binary classification: net-spec log 10 8 identity cfg-i:filter 1 / ih=1 ho=1 bo=1 model-spec log binary where the file "filter" contains 1 1 1 + = + + = + - + 1 + = + + = + - + 1 + = + + = + - + 1 + = + + = + - + 1 + = + + = + - + 1 + = + + = + - + 1 + = + + = + - + 1 + = + + = + Note that there are only three input-hidden weights in this model, although there are 24 input-hidden connections in use, and 80 possible input-hidden connections. This configuration file can be abbreviated by using the repetition facility, as follows: 1 1 1 2( + = + ) 7( - + 1 2( + = + ) ) Using no-connection groups, this can be further abbreviated to [ 2 0 0 ] 8( [ -2 + 0 ] 3( + = + ) ) Another way to specify the same configuration is as follows: 8( + + 1 2{ + = + } ) Numbers may be specified using single-letter symbols, to which values are assigned by items of the form <letter>=<number>. Numbers may also be combined with "+" or "-". For example, the above may instead be written as R=8 s=3 R( + + 1 s-1{ + = + } ) Letters have the value zero before any assignment, and may be assigned more than once - for example, a=3 a=a+1 is allowed, and leaves a with the value 4. Finally, an item of "@" (not in the middle of a triplet) will print n @ s d w on standard error, with n being an index from 1 up, and s, d, and w being the current source unit, destination unit, and weight index. This is for use in debugging configuration files. The way in which triplets defining connections are produced by the configuration file has no effect on the performance of network operations. (There may be a negligible effect on program startup time.) However, the way in which weights are associated with indexes can have an effect on performance. For example, it may be beneficial for the indexes of weights on connections to consecutive units to be consecutive. The complete syntax for a configuration file can be written as follows, with <.> for syntactic categories, | for alternatives, [.] for optional parts, {.} for parts repeated zero or more times, and double quotes for parts written literally: <configuration> : { <part> } <part> : <triplet> | [<number>]"(" { <part> } ")" | [<number>]"{" { <part> } "}" | [<number>]"[" { <part> } "]" | <letter>=<number> | "@" <triplet> : <index> <index> <index> <index> : <number> | <sign>[<number>] <number> : <term>{<sign><term>} <term> : <digits> | <letter> <digits> : <digit>{<digit>} <digits> : "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" <letter> : "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z" | "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z" Whitespace above corresponds to whitespace (spaces or newlines) in the file, and lack of whitespace above corresponds to no whitespace in the file. For example, a repeated triplet must be written like 4( + + + ), not like 4(+ + +) or 4 ( + + + ), and assignments must be written like a=123, not like a = 123. Configuration files for biases have the same syntax, except <triplet> is replaced by <doublet>, defined as <doublet>: <index> <index> Copyright (c) 2021 by Radford M. Neal