Section 1029: The chief executive
We come now to the main_control routine, which contains the master switch that causes all the various pieces of TeX to do their things, in the right order.
In a sense, this is the grand climax of the program: It applies all the tools that we have worked so hard to construct. In another sense, this is the messiest part of the program: It necessarily refers to other pieces of code all over the place, so that a person can’t fully understand what is going on without paging back and forth to be reminded of conventions that are defined elsewhere. We are now at the hub of the web, the central nervous system that touches most of the other parts and ties them together.
The structure of main_control itself is quite simple. There’s a label called big_switch, at which point the next token of input is fetched using get_x_token. Then the program branches at high speed into one of about 100 possible directions, based on the value of the current mode and the newly fetched command code; the sum abs(mode) + cur_cmd indicates what to do next. For example, the case ‘VMODE + LETTER’ arises when a letter occurs in vertical mode (or internal vertical mode); this case leads to instructions that initialize a new paragraph and enter horizontal mode.
The big case statement that contains this multiway switch has been labeled reswitch, so that the program can goto reswitch when the next token has already been fetched. Most of the cases are quite short; they call an “action procedure” that does the work for that case, and then they either goto reswitch or they “fall through” to the end of the case statement, which returns control back to big_switch. Thus, main_control is not an extremely large procedure, in spite of the multiplicity of things it must do; it is small enough to be handled by Pascal compilers that put severe restrictions on procedure size.
One case is singled out for special treatment, because it accounts for most of TeX’s activities in typical applications. The process of reading simple text and converting it into CHAR_NODE records, while looking for ligatures and kerns, is part of TeX’s “inner loop”; the whole program runs efficiently when its inner loop is fast, so this part has been written with particular care.
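The dispatch on abs(mode) + cur_cmd described above can be sketched in isolation. This is a minimal illustration rather than the program's own code: the enum `action` and the function `dispatch` are hypothetical names, and the numeric values given to the mode and command codes are assumptions, chosen so that the three mode ranges cannot overlap.

```c
#include <stdlib.h>

/* Assumed values, for illustration only: the modes are spaced more than
   MAX_COMMAND apart, so abs(mode) + cur_cmd identifies the pair uniquely. */
enum { MAX_COMMAND = 100 };
enum {
    VMODE = 1,
    HMODE = VMODE + MAX_COMMAND + 1,
    MMODE = HMODE + MAX_COMMAND + 1
};
enum { SPACER = 10, LETTER = 11 }; /* assumed command codes */

/* Hypothetical result codes standing in for the real action procedures. */
enum action { START_PARAGRAPH, TYPESET_CHARS, APPEND_SPACE, DO_NOTHING, OTHER };

/* Negative modes (such as internal vertical mode) share the cases of
   their positive counterparts because of abs(). */
enum action dispatch(int mode, int cur_cmd) {
    switch (abs(mode) + cur_cmd) {
    case VMODE + LETTER:
        return START_PARAGRAPH;
    case HMODE + LETTER:
        return TYPESET_CHARS;
    case HMODE + SPACER:
        return APPEND_SPACE;
    case VMODE + SPACER:
    case MMODE + SPACER:
        return DO_NOTHING;
    default:
        return OTHER;
    }
}
```

Because every case label is a compile-time constant, a compiler is free to turn the switch into a jump table, which is one plausible reading of "branches at high speed".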
Section 1030
We shall concentrate first on the inner loop of main_control, deferring consideration of the other cases until later.
big_switch | go here to branch on the next token of input; |
main_loop | go here to typeset a string of consecutive characters; |
main_loop_wrapup | go here to finish a character or ligature; |
main_loop_move | go here to advance the ligature cursor; |
main_loop_move_lig | same, when advancing past a generated ligature; |
main_loop_lookahead | go here to bring in another character, if any; |
main_lig_loop | go here to check for ligatures or kerning; |
append_normal_space | go here to append a normal space between words. |
// << Start file |chief.c|, 1382 >>
// << Declare action procedures for use by |main_control|, 1043 >>
// governs TeX's activities
void main_control() {
int t; // general-purpose temporary variable
if (every_job != null) {
begin_token_list(every_job, EVERY_JOB_TEXT);
}
big_switch:
get_x_token();
reswitch:
// << Give diagnostic information, if requested, 1031 >>
switch (abs(mode) + cur_cmd) {
case HMODE + LETTER:
case HMODE + OTHER_CHAR:
case HMODE + CHAR_GIVEN:
goto main_loop;
case HMODE + CHAR_NUM:
scan_char_num();
cur_chr = cur_val;
goto main_loop;
case HMODE + NO_BOUNDARY:
get_x_token();
if (cur_cmd == LETTER
|| cur_cmd == OTHER_CHAR
|| cur_cmd == CHAR_GIVEN
|| cur_cmd == CHAR_NUM)
{
cancel_boundary = true;
}
goto reswitch;
case HMODE + SPACER:
if (space_factor == 1000) {
goto append_normal_space;
}
else {
app_space();
}
break;
case HMODE + EX_SPACE:
case MMODE + EX_SPACE:
goto append_normal_space;
// << Cases of |main_control| that are not part of the inner loop, 1045 >>
} // end of the big |case| statement
goto big_switch;
main_loop:
// << Append character |cur_chr| and the following characters (if any) to the current hlist in the current font; |goto reswitch| when a non-character has been fetched, 1034 >>
append_normal_space:
// << Append a normal inter-word space to the current list, then |goto big_switch|, 1041 >>
}
Section 1031
When a new token has just been fetched at big_switch, we have an ideal place to monitor TeX’s activity.
⟨ Give diagnostic information, if requested 1031 ⟩≡
if (interrupt != 0 && ok_to_interrupt) {
back_input();
check_interrupt;
goto big_switch;
}
#ifdef DEBUG
if (panicking) {
check_mem(false);
}
#endif
if (tracing_commands > 0) {
show_cur_cmd_chr();
}
Section 1032
The following part of the program was first written in a structured manner, according to the philosophy that “premature optimization is the root of all evil”. Then it was rearranged into pieces of spaghetti so that the most common actions could proceed with little or no redundancy.
The original unoptimized form of this algorithm resembles the reconstitute procedure, which was described earlier in connection with hyphenation. Again we have an implied “cursor” between characters cur_l and cur_r. The main difference is that the lig_stack can now contain a charnode as well as pseudo-ligatures; that stack is now usually nonempty, because the next character of input (if any) has been appended to it. In main_control we have cur_r = character(lig_stack) if lig_stack > null, and cur_r = font_bchar[cur_font] otherwise,
except when character(lig_stack) = font_false_bchar[cur_font]. Several additional global variables are needed.
⟨ Global variables 13 ⟩+≡
internal_font_number main_f; // the current font
memory_word main_i; // character information bytes for |cur_l|
memory_word main_j; // ligature/kern command
int main_k; // index into |font_info|
pointer main_p; // temporary register for list manipulation
int main_s; // space factor value
halfword bchar; // boundary character of current font, or |NON_CHAR|
halfword false_bchar; // nonexistent character matching |bchar|, or |NON_CHAR|
bool cancel_boundary; // should the left boundary be ignored?
bool ins_disc; // should we insert a discretionary node?
Section 1033
The boolean variables of the main loop are normally false, and always reset to false before the loop is left. That saves us the extra work of initializing each time.
⟨ Set initial values of key variables 21 ⟩+≡
ligature_present = false;
cancel_boundary = false;
lft_hit = false;
rt_hit = false;
ins_disc = false;
Section 1034
We leave the space_factor unchanged if sf_code(cur_chr) = 0; otherwise we set it equal to sf_code(cur_chr), except that it should never change from a value less than 1000 to a value exceeding 1000. The most common case is sf_code(cur_chr) = 1000, so we want that case to be fast.
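The rule just stated can be written as a pure function, which makes the four cases easier to check by hand. This is a sketch under the assumption that the space factor and the sf_code value are plain integers; the name `next_space_factor` is hypothetical and does not appear in the program.

```c
/* Returns the new space factor, given the current one and the
   sf_code of the character being appended (hypothetical helper). */
int next_space_factor(int space_factor, int sf) {
    if (sf == 1000) {
        return 1000;                       /* the common, fast case */
    }
    if (sf < 1000) {
        return sf > 0 ? sf : space_factor; /* sf == 0 leaves it unchanged */
    }
    if (space_factor < 1000) {
        return 1000;                       /* never jump from below 1000 to above it */
    }
    return sf;
}
```

For instance, a current value of 999 followed by a character whose sf_code is 3000 yields 1000, not 3000.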
The overall structure of the main loop is presented here. Some program labels are inside the individual sections.
#define adjust_space_factor \
do { \
main_s = sf_code(cur_chr); \
if (main_s == 1000) { \
space_factor = 1000; \
} \
else if (main_s < 1000) { \
if (main_s > 0) { \
space_factor = main_s; \
} \
} \
else if (space_factor < 1000) { \
space_factor = 1000; \
} \
else { \
space_factor = main_s; \
} \
} while (0)
⟨ Append character cur_chr and the following characters (if any) to the current hlist in the current font; goto reswitch when a non-character has been fetched 1034 ⟩≡
adjust_space_factor;
main_f = cur_font;
bchar = font_bchar[main_f];
false_bchar = font_false_bchar[main_f];
if (mode > 0 && language != clang) {
fix_language();
}
fast_get_avail(lig_stack);
font(lig_stack) = main_f;
cur_l = cur_chr;
character(lig_stack) = cur_l;
cur_q = tail;
if (cancel_boundary) {
cancel_boundary = false;
main_k = NON_ADDRESS;
}
else {
main_k = bchar_label[main_f];
}
if (main_k == NON_ADDRESS) {
goto main_loop_move_2; // no left boundary processing
}
cur_r = cur_l;
cur_l = NON_CHAR;
goto main_lig_loop_1; // begin with cursor after left boundary
main_loop_wrapup:
// << Make a ligature node, if |ligature_present|; insert a null discretionary, if appropriate, 1035 >>
main_loop_move:
// << If the cursor is immediately followed by the right boundary, |goto reswitch|; if it's followed by an invalid character, |goto big_switch|; otherwise move the cursor one step to the right and |goto main_lig_loop|, 1036 >>
main_loop_lookahead:
// << Look ahead for another character, or leave |lig_stack| empty if there's none there, 1038 >>
main_lig_loop:
// << If there's a ligature/kern command relevant to |cur_l| and |cur_r|, adjust the text appropriately; exit to |main_loop_wrapup|, 1039 >>
main_loop_move_lig:
// << Move the cursor past a pseudo-ligature, then |goto main_loop_lookahead| or |main_lig_loop|, 1037 >>
Section 1035
If link(cur_q) is nonnull when wrapup is invoked, cur_q points to the list of characters that were consumed while building the ligature character cur_l.
A discretionary break is not inserted for an explicit hyphen when we are in restricted horizontal mode. In particular, this avoids putting discretionary nodes inside of other discretionaries.
// the parameter is either |rt_hit| or |false|
#define pack_lig(X) \
main_p = new_ligature(main_f, cur_l, link(cur_q)); \
if (lft_hit) { \
subtype(main_p) = 2; \
lft_hit = false; \
} \
if ((X) && lig_stack == null) { \
incr(subtype(main_p)); \
rt_hit = false; \
} \
link(cur_q) = main_p; \
tail = main_p; \
ligature_present = false
#define wrapup(X) \
do { \
if (cur_l < NON_CHAR) { \
if (link(cur_q) > null \
&& character(tail) == hyphen_char[main_f]) \
{ \
ins_disc = true; \
} \
if (ligature_present) { \
pack_lig((X)); \
} \
if (ins_disc) { \
ins_disc = false; \
if (mode > 0) { \
tail_append(new_disc()); \
} \
} \
} \
} while (0)
⟨ Make a ligature node, if ligature_present; insert a null discretionary, if appropriate 1035 ⟩≡
wrapup(rt_hit);
Section 1036
⟨ If the cursor is immediately followed by the right boundary, goto reswitch; if it’s followed by an invalid character, goto big_switch; otherwise move the cursor one step to the right and goto main_lig_loop 1036 ⟩≡
if (lig_stack == null) {
goto reswitch;
}
cur_q = tail;
cur_l = character(lig_stack);
main_loop_move_1:
if (!is_char_node(lig_stack)) {
goto main_loop_move_lig;
}
main_loop_move_2:
if (cur_chr < font_bc[main_f] || cur_chr > font_ec[main_f]) {
char_warning(main_f, cur_chr);
free_avail(lig_stack);
goto big_switch;
}
main_i = char_info(main_f, cur_l);
if (!char_exists(main_i)) {
char_warning(main_f, cur_chr);
free_avail(lig_stack);
goto big_switch;
}
link(tail) = lig_stack;
tail = lig_stack; // |main_loop_lookahead| is next
Section 1037
Here we are at main_loop_move_lig. When we begin this code we have cur_q = tail and cur_l = character(lig_stack).
⟨ Move the cursor past a pseudo-ligature, then goto main_loop_lookahead or main_lig_loop 1037 ⟩≡
main_p = lig_ptr(lig_stack);
if (main_p > null) {
tail_append(main_p); // append a single character
}
temp_ptr = lig_stack;
lig_stack = link(temp_ptr);
free_node(temp_ptr, SMALL_NODE_SIZE);
main_i = char_info(main_f, cur_l);
ligature_present = true;
if (lig_stack == null) {
if (main_p > null) {
goto main_loop_lookahead;
}
else {
cur_r = bchar;
}
}
else {
cur_r = character(lig_stack);
}
goto main_lig_loop;
Section 1038
The result of \char can participate in a ligature or kern, so we must look ahead for it.
⟨ Look ahead for another character, or leave lig_stack empty if there’s none there 1038 ⟩≡
get_next(); // set only |cur_cmd| and |cur_chr|, for speed
if (cur_cmd == LETTER
|| cur_cmd == OTHER_CHAR
|| cur_cmd == CHAR_GIVEN)
{
goto main_loop_lookahead_1;
}
x_token(); // now expand and set |cur_cmd|, |cur_chr|, |cur_tok|
if (cur_cmd == LETTER
|| cur_cmd == OTHER_CHAR
|| cur_cmd == CHAR_GIVEN)
{
goto main_loop_lookahead_1;
}
if (cur_cmd == CHAR_NUM) {
scan_char_num();
cur_chr = cur_val;
goto main_loop_lookahead_1;
}
if (cur_cmd == NO_BOUNDARY) {
bchar = NON_CHAR;
}
cur_r = bchar;
lig_stack = null;
goto main_lig_loop;
main_loop_lookahead_1:
adjust_space_factor;
fast_get_avail(lig_stack);
font(lig_stack) = main_f;
cur_r = cur_chr;
character(lig_stack) = cur_r;
if (cur_r == false_bchar) {
cur_r = NON_CHAR; // this prevents spurious ligatures
}
Section 1039
Even though comparatively few characters have a lig/kern program, several of the instructions here count as part of TeX’s inner loop, since a potentially long sequential search must be performed. For example, tests with Computer Modern Roman showed that about 40 per cent of all characters actually encountered in practice had a lig/kern program, and that about four lig/kern commands were investigated for every such character.
At the beginning of this code we have main_i = char_info(main_f, cur_l).
⟨ If there’s a ligature/kern command relevant to cur_l and cur_r, adjust the text appropriately; exit to main_loop_wrapup 1039 ⟩≡
if (char_tag(main_i) != LIG_TAG || cur_r == NON_CHAR) {
goto main_loop_wrapup;
}
main_k = lig_kern_start(main_f, main_i);
main_j = font_info[main_k];
if (skip_byte(main_j) <= STOP_FLAG) {
goto main_lig_loop_2;
}
main_k = lig_kern_restart(main_f, main_j);
main_lig_loop_1:
main_j = font_info[main_k];
main_lig_loop_2:
if (next_char(main_j) == cur_r && skip_byte(main_j) <= STOP_FLAG) {
// << Do ligature or kern command, returning to |main_lig_loop| or |main_loop_wrapup| or |main_loop_move|, 1040 >>
}
if (skip_byte(main_j) == 0) {
incr(main_k);
}
else {
if (skip_byte(main_j) >= STOP_FLAG) {
goto main_loop_wrapup;
}
main_k += skip_byte(main_j) + 1;
}
goto main_lig_loop_1;
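The sequential search performed by this section can be isolated as follows. This is a simplified sketch, not the program's code: the struct `lig_kern_step` and the function `find_lig_kern` are hypothetical, the value 128 for STOP_FLAG is an assumption, and the indirect "restart" entry handled by lig_kern_restart is left out. The caller is assumed to pass a program that ends with a stop step.

```c
enum { STOP_FLAG = 128 }; /* assumed value of the stop marker */

/* One lig/kern step; the field names follow the accessors used in the text. */
struct lig_kern_step {
    unsigned char skip_byte, next_char, op_byte, rem_byte;
};

/* Returns the index of the first active step whose next_char equals
   cur_r, or -1 if the program ends without a match. */
int find_lig_kern(const struct lig_kern_step *prog, int start,
                  unsigned char cur_r) {
    int k = start;
    for (;;) {
        const struct lig_kern_step *j = &prog[k];
        if (j->next_char == cur_r && j->skip_byte <= STOP_FLAG) {
            return k;               /* a ligature or kern command applies */
        }
        if (j->skip_byte == 0) {
            k++;                    /* fall through to the next step */
        } else {
            if (j->skip_byte >= STOP_FLAG) {
                return -1;          /* end of this character's program */
            }
            k += j->skip_byte + 1;  /* skip over the intervening steps */
        }
    }
}
```

The search is linear in the number of steps examined, which is why the text counts these instructions as part of the inner loop.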
Section 1040
When a ligature or kern instruction matches a character, we know from read_font_info that the character exists in the font, even though we haven’t verified its existence in the normal way.
This section could be made into a subroutine, if the code inside main_control needs to be shortened.
⟨ Do ligature or kern command, returning to main_lig_loop or main_loop_wrapup or main_loop_move 1040 ⟩≡
if (op_byte(main_j) >= KERN_FLAG) {
wrapup(rt_hit);
tail_append(new_kern(char_kern(main_f, main_j)));
goto main_loop_move;
}
if (cur_l == NON_CHAR) {
lft_hit = true;
}
else if (lig_stack == null) {
rt_hit = true;
}
check_interrupt; // allow a way out in case there's an infinite ligature loop
switch (op_byte(main_j)) {
case 1:
case 5:
// =:|, =:|>
cur_l = rem_byte(main_j);
main_i = char_info(main_f, cur_l);
ligature_present = true;
break;
case 2:
case 6:
// |=:, |=:>
cur_r = rem_byte(main_j);
if (lig_stack == null) {
// right boundary character is being consumed
lig_stack = new_lig_item(cur_r);
bchar = NON_CHAR;
}
else if (is_char_node(lig_stack)) {
// |link(lig_stack) = null|
main_p = lig_stack;
lig_stack = new_lig_item(cur_r);
lig_ptr(lig_stack) = main_p;
}
else {
character(lig_stack) = cur_r;
}
break;
case 3:
// |=:|
cur_r = rem_byte(main_j);
main_p = lig_stack;
lig_stack = new_lig_item(cur_r);
link(lig_stack) = main_p;
break;
case 7:
case 11:
// |=:|>, |=:|>>
wrapup(false);
cur_q = tail;
cur_l = rem_byte(main_j);
main_i = char_info(main_f, cur_l);
ligature_present = true;
break;
default:
// =:
cur_l = rem_byte(main_j);
ligature_present = true;
if (lig_stack == null) {
goto main_loop_wrapup;
}
else {
goto main_loop_move_1;
}
}
if (op_byte(main_j) > 4 && op_byte(main_j) != 7) {
goto main_loop_wrapup;
}
if (cur_l < NON_CHAR) {
goto main_lig_loop;
}
main_k = bchar_label[main_f];
goto main_lig_loop_1;
Section 1041
The occurrence of blank spaces is almost part of TeX’s inner loop, since we usually encounter about one space for every five non-blank characters. Therefore main_control gives second-highest priority to ordinary spaces.
When a glue parameter like \spaceskip is set to ‘0pt’, we will see to it later that the corresponding glue specification is precisely ZERO_GLUE, not merely a pointer to some specification that happens to be full of zeroes.
Therefore it is simple to test whether a glue parameter is zero or not.
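The idea of a single canonical zero specification can be illustrated with ordinary C pointers. This is only a sketch of the principle: the struct `glue_spec`, the helper `normalize_glue`, and the pointer-valued ZERO_GLUE below are assumptions made for the example, and the program's own representation of glue specifications may differ.

```c
/* Hypothetical glue specification with just the three dimensions. */
struct glue_spec {
    long width, stretch, shrink;
};

static const struct glue_spec zero_glue_spec = {0, 0, 0};
#define ZERO_GLUE (&zero_glue_spec)

/* Replace any all-zero specification by the canonical one, so that
   later tests need only compare pointers. */
const struct glue_spec *normalize_glue(const struct glue_spec *g) {
    if (g->width == 0 && g->stretch == 0 && g->shrink == 0) {
        return ZERO_GLUE;
    }
    return g;
}

/* The cheap test that the normalization makes possible. */
int is_zero_glue(const struct glue_spec *g) {
    return g == ZERO_GLUE;
}
```

Once every assignment goes through such a normalization step, the test in the inner loop costs a single comparison instead of three.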
⟨ Append a normal inter-word space to the current list, then goto big_switch 1041 ⟩≡
if (space_skip == ZERO_GLUE) {
// << Find the glue specification, |main_p|, for text spaces in the current font, 1042 >>
temp_ptr = new_glue(main_p);
}
else {
temp_ptr = new_param_glue(SPACE_SKIP_CODE);
}
link(tail) = temp_ptr;
tail = temp_ptr;
goto big_switch;
Section 1042
Having font_glue allocated for each text font saves both time and memory.
If any of the three spacing parameters are subsequently changed by the use of \fontdimen, the find_font_dimen procedure deallocates the font_glue specification allocated here.
⟨ Find the glue specification, main_p, for text spaces in the current font 1042 ⟩≡
main_p = font_glue[cur_font];
if (main_p == null) {
main_p = new_spec(ZERO_GLUE);
main_k = param_base[cur_font] + SPACE_CODE;
width(main_p) = font_info[main_k].sc; // that's |space(cur_font)|
stretch(main_p) = font_info[main_k + 1].sc; // and |space_stretch(cur_font)|
shrink(main_p) = font_info[main_k + 2].sc; // and |space_shrink(cur_font)|
font_glue[cur_font] = main_p;
}
Section 1043
⟨ Declare action procedures for use by main_control 1043 ⟩≡
// handle spaces when |space_factor != 1000|
void app_space() {
pointer q; // glue node
if (space_factor >= 2000 && xspace_skip != ZERO_GLUE) {
q = new_param_glue(XSPACE_SKIP_CODE);
}
else {
if (space_skip != ZERO_GLUE) {
main_p = space_skip;
}
else {
// << Find the glue specification, |main_p|, for text spaces in the current font, 1042 >>
}
main_p = new_spec(main_p);
// << Modify the glue specification in |main_p| according to the space factor, 1044 >>
q = new_glue(main_p);
glue_ref_count(main_p) = null;
}
link(tail) = q;
tail = q;
}
Section 1044
⟨ Modify the glue specification in main_p according to the space factor 1044 ⟩≡
if (space_factor >= 2000) {
width(main_p) += extra_space(cur_font);
}
stretch(main_p) = xn_over_d(stretch(main_p), space_factor, 1000);
shrink(main_p) = xn_over_d(shrink(main_p), 1000, space_factor);
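The effect of the space factor on a glue specification can be checked with a small stand-alone version. This is a sketch under stated assumptions: the struct `glue_spec` and the functions `scale` and `adjust_for_space_factor` are hypothetical, and `scale` is only a stand-in for xn_over_d that computes a truncated x·n/d for nonnegative x in 64-bit arithmetic, without the overflow and remainder handling of the real routine.

```c
/* Hypothetical glue specification; dimensions are scaled integers. */
struct glue_spec {
    long width, stretch, shrink;
};

/* Stand-in for xn_over_d: x * n / d, truncated, for nonnegative x. */
static long scale(long x, long n, long d) {
    return (long)(((long long)x * n) / d);
}

/* space_factor is assumed to be positive, as it is in horizontal mode. */
struct glue_spec adjust_for_space_factor(struct glue_spec g, int space_factor,
                                         long extra_space) {
    if (space_factor >= 2000) {
        g.width += extra_space;  /* extra width after sentence-ending punctuation */
    }
    g.stretch = scale(g.stretch, space_factor, 1000); /* stretch grows with the factor */
    g.shrink = scale(g.shrink, 1000, space_factor);   /* shrink falls with the factor */
    return g;
}
```

Note the asymmetry: a larger space factor makes the space more stretchable but less shrinkable.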
Section 1045
Whew—that covers the main loop. We can now proceed at a leisurely pace through the other combinations of possibilities.
// for mode-independent commands
#define any_mode(X) \
case VMODE + (X): \
case HMODE + (X): \
case MMODE + (X)
⟨ Cases of main_control that are not part of the inner loop 1045 ⟩≡
any_mode(RELAX):
case VMODE + SPACER:
case MMODE + SPACER:
case MMODE + NO_BOUNDARY:
do_nothing;
break;
any_mode(IGNORE_SPACES):
// << Get the next non-blank non-call token, 406 >>
goto reswitch;
case VMODE + STOP:
if (its_all_over()) {
// this is the only way out
return;
}
break;
// << Forbidden cases detected in |main_control|, 1048 >>
any_mode(MAC_PARAM):
report_illegal_case();
break;
// << Math-only cases in non-math modes, or vice versa, 1046 >>
insert_dollar_sign();
break;
// << Cases of |main_control| that build boxes and lists, 1056 >>
// << Cases of |main_control| that don't depend on |mode|, 1210 >>
// << Cases of |main_control| that are for extensions to TeX, 1347 >>
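The any_mode macro works because a case label may be followed directly by another case label, and because the use site supplies the final colon. The following compact sketch shows the expansion in action; the numeric values of the modes and command codes are assumptions, and the enum `verdict` and function `classify` are hypothetical names used only for this illustration. A two-mode variant, non_math, is included for comparison.

```c
#include <stdlib.h>

/* Assumed values, for illustration only. */
enum { MAX_COMMAND = 100 };
enum {
    VMODE = 1,
    HMODE = VMODE + MAX_COMMAND + 1,
    MMODE = HMODE + MAX_COMMAND + 1
};
enum { RELAX = 0, MAC_PARAM = 6, SUP_MARK = 7 }; /* assumed command codes */

/* Each macro expands to a run of case labels; the colon after the
   last label is written at the point of use. */
#define any_mode(X) \
    case VMODE + (X): \
    case HMODE + (X): \
    case MMODE + (X)
#define non_math(X) \
    case VMODE + (X): \
    case HMODE + (X)

enum verdict { IGNORED, ILLEGAL, NEEDS_DOLLAR, UNHANDLED };

enum verdict classify(int mode, int cur_cmd) {
    switch (abs(mode) + cur_cmd) {
    any_mode(RELAX):
        return IGNORED;
    any_mode(MAC_PARAM):
        return ILLEGAL;
    non_math(SUP_MARK):
        return NEEDS_DOLLAR;
    default:
        return UNHANDLED;
    }
}
```

Here a superscript mark in vertical or horizontal mode is flagged as needing a dollar sign, while the same command in math mode falls through to the default.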
Section 1046
Here is a list of cases where the user has probably gotten into or out of math mode by mistake. TeX will insert a dollar sign and rescan the current token.
#define non_math(X) \
case VMODE + (X): \
case HMODE + (X)
⟨ Math-only cases in non-math modes, or vice versa 1046 ⟩≡
non_math(SUP_MARK):
non_math(SUB_MARK):
non_math(MATH_CHAR_NUM):
non_math(MATH_GIVEN):
non_math(MATH_COMP):
non_math(DELIM_NUM):
non_math(LEFT_RIGHT):
non_math(ABOVE):
non_math(RADICAL):
non_math(MATH_STYLE):
non_math(MATH_CHOICE):
non_math(VCENTER):
non_math(NON_SCRIPT):
non_math(MKERN):
non_math(LIMIT_SWITCH):
non_math(MSKIP):
non_math(MATH_ACCENT):
case MMODE + ENDV:
case MMODE + PAR_END:
case MMODE + STOP:
case MMODE + VSKIP:
case MMODE + UN_VBOX:
case MMODE + VALIGN:
case MMODE + HRULE:
Section 1047
⟨ Declare action procedures for use by main_control 1043 ⟩+≡
void insert_dollar_sign() {
back_input();
cur_tok = MATH_SHIFT_TOKEN + '$';
print_err("Missing $ inserted");
help2("I've inserted a begin-math/end-math symbol since I think")
("you left one out. Proceed, with fingers crossed.");
ins_error();
}
Section 1048
When erroneous situations arise, TeX usually issues an error message specific to the particular error.
For example, ‘\noalign’ should not appear in any mode, since it is recognized by the align_peek routine in all of its legitimate appearances; a special error message is given when ‘\noalign’ occurs elsewhere.
But sometimes the most appropriate error message is simply that the user is not allowed to do what he or she has attempted.
For example, ‘\moveleft’ is allowed only in vertical mode, and ‘\lower’ only in non-vertical modes.
Such cases are enumerated here and in the other sections referred to under ‘See also ...’.
⟨ Forbidden cases detected in main_control 1048 ⟩≡
case VMODE + VMOVE:
case HMODE + HMOVE:
case MMODE + HMOVE:
any_mode(LAST_ITEM):
Section 1049
The ‘you_cant’ procedure prints a line saying that the current command is illegal in the current mode; it identifies these things symbolically.
void you_cant() {
print_err("You can't use `");
print_cmd_chr(cur_cmd, cur_chr);
print("' in ");
print_mode(mode);
}
Section 1050
⟨ Declare action procedures for use by main_control 1043 ⟩+≡
void report_illegal_case() {
you_cant();
help4("Sorry, but I'm not programmed to handle this case;")
("I'll just pretend that you didn't ask for it.")
("If you're in the wrong mode, you might be able to")
("return to the right one by typing `I}' or `I$' or `I\\par'.");
error();
}
Section 1051
Some operations are allowed only in privileged modes, i.e., in cases where mode > 0. The privileged function is used to detect violations of this rule; it issues an error message and returns false if the current mode is negative.
⟨ Declare action procedures for use by main_control 1043 ⟩+≡
bool privileged() {
if (mode > 0) {
return true;
}
else {
report_illegal_case();
return false;
}
}
Section 1052
Either \dump or \end will cause main_control to enter the endgame, since both of them have ‘STOP’ as their command code.
⟨ Put each of TeX’s primitives into the hash table 226 ⟩+≡
primitive("end", STOP, 0);
primitive("dump", STOP, 1);
Section 1053
⟨ Cases of print_cmd_chr for symbolic printing of primitives 227 ⟩+≡
case STOP:
if (chr_code == 1) {
print_esc("dump");
}
else {
print_esc("end");
}
break;
Section 1054
We don’t want to leave main_control immediately when a STOP command is sensed, because it may be necessary to invoke an \output routine several times before things really grind to a halt.
(The output routine might even say ‘\gdef\end{...}’, to prolong the life of the job.)
Therefore its_all_over is true only when the current page and contribution list are empty, and when the last output was not a “dead cycle”.
⟨ Declare action procedures for use by main_control 1043 ⟩+≡
// do this when \end or \dump occurs
bool its_all_over() {
if (privileged()) {
if (PAGE_HEAD == page_tail && head == tail && dead_cycles == 0) {
return true;
}
back_input(); // we will try to end again after ejecting residual material
tail_append(new_null_box());
width(tail) = hsize;
tail_append(new_glue(FILL_GLUE));
tail_append(new_penalty(-0x40000000));
build_page(); // append \hbox to \hsize{}\vfill\penalty-'10000000000
}
return false;
}