ListProc 8.0 Release Notes
ListProc 8.0 includes a number of new features and improves on the features present in version 7.2. The bulk of these changes are aimed at enhancing ListProc's ability to efficiently deal with large lists, and with large numbers of lists. The sections on multi-threading and general speedups are most pertinent in this regard.
In addition to increased efficiency, ListProc 8.0 offers greater reliability through a number of general enhancements and bug fixes. These are listed toward the end of this document.
A plain text version of these release notes is available on CREN's ftp server.
New Features
FULLY MULTI-THREADED
ListProc 8.0 is now fully multi-threaded. This allows it to process mail for several lists simultaneously.
- The "threads" config file directive
- A new config file directive "threads" controls the number of threads ListProc will create. Usage of the "threads" directive is as follows:
threads number
where number is the number of threads to use. Valid numbers of threads are between 1 and 256, inclusive. A value of 1 results in the previous, unthreaded behavior.
If no value of "threads" is specified, ListProc defaults to 1 thread.
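The rules for the "threads" directive can be illustrated with a short Python sketch. This is not ListProc's own parser; the function name parse_threads and the way lines are split are assumptions made for illustration only.

```python
def parse_threads(config_text, default=1):
    """Return the thread count described by a config file's text.

    Hypothetical helper: ListProc's real parser is not shown in these notes.
    """
    threads = default  # no "threads" line means a single thread
    for line in config_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "threads":
            value = int(fields[1])
            if not 1 <= value <= 256:
                raise ValueError("threads must be between 1 and 256, got %d" % value)
            threads = value
    return threads
```

For example, parse_threads("threads 10") yields 10, while a config with no "threads" line yields the default of 1.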
- Synchronized access to data files
- Accesses to aliases and ignored files have been synchronized for multiple threads. Synchronization is done via the system's semaphore. In previous versions, this was only necessary when ILP was enabled. The use of multiple threads in 8.0 requires the use of semaphores at all times.
- Error mail processing
- As long as the config file defines more than one thread, ListProc will handle error mail for all lists asynchronously with a single thread, rather than dealing with it when processing the list mail. This has the following implications:
- There is at most one thread processing error mail at any one time, regardless of the number of threads defined, and the number of threads still free.
- It is possible for ListProc to concurrently process list mail and error mail for a single list. In this case, the error mail is given lower priority.
- This will significantly speed up the processing of each list, since each list's thread will only need to handle regular list mail and digests.
- ListProc's flow of control
- Simultaneous processing is performed on as many lists as there are threads defined in the config file. Hence, if your config file contained "threads 10", ListProc would be able to concurrently process mail from 10 different lists.
The spawning of threads is still subject to delays specified by the "frequency" directive in the config file. For example, suppose your config file contained the line "frequency 0 30". In ListProc 7.X, the flow of list processing would look like this:
process list 1 (regular mail, digests, error mail)
process server mail
process list 2 (regular mail, digests, error mail)
process server mail
...
process final list (regular mail, digests, error mail)
process server mail
wait 30 seconds
repeat
ListProc 8.0 would instead do the following:
spawn list 1 thread (regular mail, digests)
spawn error thread if error mail exists and
no error thread is already running.
spawn server mail thread unless one already exists
spawn list 2 thread (regular mail, digests)
spawn error thread if error mail exists and
no error thread is already running.
spawn server mail thread unless one already exists
...
...
spawn final list thread (regular mail, digests)
spawn error thread if error mail exists and
no error thread is already running.
spawn server mail thread unless one already exists
wait 30 seconds
repeat
Notice that in 8.0 ListProc does not need to wait for the processing of one list to finish before it spawns a thread to process the next list.
Within each list, processing is still serialized: regular mail and digests are never handled concurrently for a given list. As noted above, the error processing has been taken out of the list processing loop. Error mail is processed by the first available thread after error mail arrives.
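The 8.0 flow described above can be approximated in Python as follows. This is a simplified sketch, not the serverd implementation: the names run_cycle, process_list, process_errors and has_error_mail are hypothetical, server mail is left out, and the sketch waits for all workers at the end of a pass, which the real server need not do.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def run_cycle(lists, threads, has_error_mail, process_list, process_errors):
    """One pass over all lists, in the spirit of the 8.0 flow above."""
    error_running = threading.Lock()  # held while the single error worker runs

    def error_worker():
        try:
            process_errors()
        finally:
            error_running.release()

    with ThreadPoolExecutor(max_workers=threads) as pool:
        for name in lists:
            pool.submit(process_list, name)  # regular mail and digests only
            # Start the error worker only if none is running already.
            if has_error_mail() and error_running.acquire(blocking=False):
                pool.submit(error_worker)
    # Leaving the "with" block waits for all submitted work to finish.
```

The pool size plays the role of the "threads" directive, and the non-blocking lock models the rule that at most one thread processes error mail at a time.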
- Replacement of system() calls
- The multi-threaded code required the use of fork() and exec() rather than system(). This also allows us to circumvent the problems with the shell that's spawned by system(). Consequently, this allows us to lift the restriction of no apostrophes in a user address. (see the GENERAL CHANGES section below) All binaries have been converted to make use of the new sysexec() routine that performs the fork(), exec() and wait() functions. All of these required significant rewriting of signal handling routines in serverd.c.
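The fork(), exec() and wait() pattern that sysexec() is described as performing looks roughly like the Python sketch below. It is an illustrative stand-in, not the ListProc source; only the name sysexec comes from the notes, and the return-value convention is an assumption.

```python
import os

def sysexec(argv):
    """Run argv without a shell and return its exit status.

    Illustrative stand-in for the sysexec() routine named above.
    """
    pid = os.fork()
    if pid == 0:
        # Child: replace the process image; arguments are passed verbatim,
        # so no shell ever interprets quotes or apostrophes.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    _, status = os.waitpid(pid, 0)  # parent: wait for the child to finish
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
```

Because each argument reaches the child as-is, a call such as sysexec(["echo", "o'brien@example.com"]) needs no quoting, which is why the apostrophe restriction could be lifted.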
- Updated semset utility
- The semset utility included in the distribution has been updated to show how many lists are delivering mail at the time of invocation. Invoked with no arguments, semset returns basic information about ListProc's interprocess communication (the output has been wrapped to fit an 80 column display):
%semset
IPC status from <RUNNING SYSTEM> as of Thu Feb 8 12:22:48 1996
Message Queue facility not in system.
Shared Memory facility not in system.
T ID KEY MODE OWNER GROUP CREATOR
Semaphores:
s 131072 0x0x00000000 --ra------- server staff root
CGROUP NSEMS OTIME CTIME
staff 1 12:20:26 13:49:31
When semset is called with the semaphore ID as an argument, it returns specific information about that semaphore's current state. The following flag information can be displayed:
SEM_REQ_ID Requesting a tag ID, request #, etc.
SEM_SYSFILES Accessing/Processing global server files
SEM_LISTFILES Accessing/Processing list specific files
SEM_ARCHIVES Accessing/Processing a list archive
SEM_DLVR_MAIL(N) Delivering mail (N = number of lists
currently delivering)
The sample output below shows ListProc's processing on our server over the course of about 30 seconds:
% semset 131072
Current value: 0x10; flags on: SEM_DLVR_MAIL(1x)
% semset 131072
Current value: 0x10; flags on: SEM_DLVR_MAIL(1x)
% semset 131072
Current value: 0x10; flags on: SEM_DLVR_MAIL(1x)
% semset 131072
Current value: 0x14; flags on: SEM_LISTFILES SEM_DLVR_MAIL(1x)
% semset 131072
Current value: 0x20; flags on: SEM_DLVR_MAIL(2x)
% semset 131072
Current value: 0x20; flags on: SEM_DLVR_MAIL(2x)
% semset 131072
Current value: 0x20; flags on: SEM_DLVR_MAIL(2x)
% semset 131072
Current value: 0x24; flags on: SEM_LISTFILES SEM_DLVR_MAIL(2x)
% semset 131072
Current value: 0x10; flags on: SEM_DLVR_MAIL(1x)
% semset 131072
Current value: 0x10; flags on: SEM_DLVR_MAIL(1x)
% semset 131072
Current value: 0x10; flags on: SEM_DLVR_MAIL(1x)
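The sample values above suggest how the semaphore value is laid out: the delivery count appears to sit above the low four bits, and SEM_LISTFILES appears to be 0x04. The Python sketch below decodes a value on that basis. The bit positions given for SEM_REQ_ID, SEM_SYSFILES and SEM_ARCHIVES are assumptions inferred from the order in which the flags are listed; they are not confirmed by the sample output.

```python
LOW_FLAGS = [
    (0x01, "SEM_REQ_ID"),     # assumed bit position
    (0x02, "SEM_SYSFILES"),   # assumed bit position
    (0x04, "SEM_LISTFILES"),  # consistent with 0x14 and 0x24 above
    (0x08, "SEM_ARCHIVES"),   # assumed bit position
]

def decode_semaphore(value):
    """Return the flag names that a semaphore value would display."""
    names = [name for bit, name in LOW_FLAGS if value & bit]
    delivering = value >> 4  # number of lists currently delivering mail
    if delivering:
        names.append("SEM_DLVR_MAIL(%dx)" % delivering)
    return " ".join(names)
```

Under these assumptions, decode_semaphore(0x24) gives "SEM_LISTFILES SEM_DLVR_MAIL(2x)", matching the sample.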
- Multi-threading caveats and cautions
- Multi-threading can potentially create a much larger drain on your system's resources. If the system-wide process table is not large enough, you may run out of available processes. Furthermore, each thread requires its own memory and makes separate demands on processor and disk resources.
Perhaps most significantly, each thread opens its own sendmail processes. (The number of sendmail processes for each list depends on the number specified by the "multiple_recipients" directive in the config file. For example, if "multiple_recipients" is set to M, a list with S subscribers will spawn S/M sendmail processes.) Increasing the number of lists processing at one time will cause a proportionally larger number of sendmail processes to run.
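To get a feel for the numbers, the S/M relationship can be turned into a rough estimate. The Python sketch below assumes that a partial batch still needs its own sendmail process (rounding up) and that, in the worst case, the largest lists are all delivering at once; both are assumptions, and the function names are hypothetical.

```python
import math

def sendmail_processes(subscribers, multiple_recipients):
    """Sendmail processes for one list: S/M, rounded up (assumed)."""
    return math.ceil(subscribers / multiple_recipients)

def peak_sendmail_processes(list_sizes, multiple_recipients, threads):
    """Worst case: the largest lists occupy every thread at the same time."""
    busiest = sorted(list_sizes, reverse=True)[:threads]
    return sum(sendmail_processes(s, multiple_recipients) for s in busiest)
```

With "multiple_recipients" set to 100, a 5000-subscriber list accounts for 50 sendmail processes, so ten such lists under "threads 10" could account for 500.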
An additional caveat with multi-threaded processing is that serverd must wait for all outstanding threads to quit before it can re-initialize. (This is essentially also what happens with 7.x - it is just that with only one thread running, the wait time will probably not be very long.) Once the server receives a notification that it must re-initialize, no new threads are spawned while serverd waits for the existing threads to complete operation. Once all threads have finished, serverd can rebuild its internal data structures.
- The LISTS command is slow because it must check each hidden list to see if the requester is either an owner or a subscriber. The new routine is 50% faster than the previous version.
- Checking whether a user is an owner is now done in memory; this speeds up the LISTS request as well as all list-specific requests.
- The speed of subscriber lookups has been increased between 30 and 100 times, depending on the request. Some of the changes that allow this increase required the removal of comment support for the subscribers file.
- QUERY requests have been sped up further by eliminating system calls.
- The SET request has been sped up by a factor of 25. The user's entry is overwritten with blanks and a new entry is placed at the end of the file. Blank entries are cleaned up once a day, when the list is sorted.
- Sped up the error analysis routine by overwriting subscriber entries to be removed with blanks.
- Sped up the lookup of alternate addresses when a user is not subscribed. The search time was cut in half.
- Small performance enhancement in list.c by keeping track of internal unprocessed information (messages, and file offsets).
- Sped up the following requests while in interactive, casual user mode: LISTS, GET, INDEX, SEARCH, REVIEW, STATS. Since casual users do not provide an email address, they are never authorized to receive information about private lists. Thus, ListProc no longer checks the subscribers file of hidden lists to see if casual users are subscribed.
- Sped up SET ... PREF requests by avoiding subscriber look up. Since this command is only valid for list owners, the subscriber file look up is unnecessary.
ListProc now allows the option of calling an external accounting program each time list digests and regular list mail are distributed. The accounting program is NOT invoked when error messages are sent out. The environment variable ULISTPROC_LIST_ACCOUNTING_PROG specifies the full path to the accounting program to use. (If this variable is not set, no accounting program is called.)
The accounting program is passed the following information:
- the list alias
- the value of LPDIR
- the total number of regular mail recipients (including peers)
- the total number of digest recipients
- the total number of newsgroup postings
- the filename that contains the headers of the message distributed
- the filename that contains the body of the message distributed
The accounting program is responsible for calculating file sizes, keeping track of the date and time, running totals, etc. There will be further need in the future for more accounting information, like who accesses list archives, who gets what and how big the files are, etc. These will be addressed on a need-to-have basis.
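A minimal accounting program might look like the Python sketch below. It assumes the seven values are passed as command-line arguments in the order listed above, which these notes do not state explicitly; the log file name accounting.log and the record format are hypothetical.

```python
import os
import sys
import time

def account(argv, log_path):
    """Append one accounting record and return it.

    argv is assumed to hold the seven values in the order listed above.
    """
    alias, lpdir, n_mail, n_digest, n_news, hdr_file, body_file = argv
    # The accounting program, not ListProc, works out the message size.
    size = os.path.getsize(hdr_file) + os.path.getsize(body_file)
    record = "%s %s mail=%d digest=%d news=%d bytes=%d\n" % (
        time.strftime("%Y-%m-%d %H:%M:%S"), alias,
        int(n_mail), int(n_digest), int(n_news), size)
    with open(log_path, "a") as log:
        log.write(record)
    return record

if __name__ == "__main__" and len(sys.argv) == 8:
    # Hypothetical choice: keep the log under the LPDIR passed by ListProc.
    account(sys.argv[1:8], os.path.join(sys.argv[2], "accounting.log"))
```

Running totals could then be derived from the log by a separate report script.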
- Loop detection has been enhanced to deal with an additional type of loop. Consider the following scenario:
An error message comes in for a user, and a notification that it was ignored is sent to the list owner. If the message to the list owner bounces, ListProc will receive a bounced mail message. This will create another error message, which is again sent to the owner. This problem was not caught by the present loop detection system, since the returned error messages kept growing.
To solve this problem, ListProc now looks for Message-ID: and X-Listprocessor-Version: header lines in the body *if* the message is sent to the errors folder. When a message id is found that has already been processed, the message is flushed. Since the problem could well be the owner's address, the owner is NOT notified, even if CCIGNORE is set.
If X-Listprocessor-Version: is found in the body, the owner WILL be notified if CCIGNORE is set, unless a message id is found that has already been processed.
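The decision described above can be sketched in Python. This is an interpretation of the rules, not ListProc's code: the function name, the action labels "flush", "ignore" and "analyze", and the choice to record every message id found in a body as processed are all assumptions.

```python
import re

MESSAGE_ID = re.compile(r"^[ \t]*Message-ID:[ \t]*(\S+)", re.IGNORECASE | re.MULTILINE)
VERSION = re.compile(r"^[ \t]*X-Listprocessor-Version:", re.IGNORECASE | re.MULTILINE)

def classify_error_mail(body, seen_ids, ccignore):
    """Return (action, notify_owner) for a message filed in the errors folder."""
    ids = MESSAGE_ID.findall(body)
    if any(i in seen_ids for i in ids):
        # A message id seen before means a loop: flush and stay silent,
        # since the owner's own address may be the one that bounces.
        return ("flush", False)
    seen_ids.update(ids)
    if VERSION.search(body):
        # ListProc-generated mail came back: notify only if CCIGNORE is set.
        return ("ignore", ccignore)
    return ("analyze", False)  # hypothetical label for normal error handling
```

The first bounce of a ListProc notification is reported to the owner when CCIGNORE is set; a second bounce carrying the same message id is flushed silently.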
- ListProc 8.0 no longer performs loop-detection on the Reply-To: header of messages. These checks were unnecessary to catch mail loops, and they inhibited legitimate use of the "Reply-To:" to force replies to individual messages to be sent to a specified list or lists.
The "header" config directive has been expanded to affect all lists and the server, and to allow for header lines to not be propagated, as follows:
header * {
Headerline:
!Headerline:
...
}
By specifying a * all lists and the server are affected. Header lines to be propagated are listed as before, and header lines not to be propagated should start with an exclamation point ("!"). Headers propagated (or not propagated) to individual lists can be controlled by using a list name instead of the * as follows:
header foo-list {
Headerline:
!Headerline:
...
}
Notice that "header lines" are actually regular expressions as before, so, for example, the following is valid:
header * {
!X-.+-expire:
}
The following header lines will be preserved regardless of the header directive:
From: Reply-To:
Resent-From: Date:
Sender: Control:
Resent-Sender: Approved:
Message-Id: Archive-Name:
Resent-Message-Id: all MIME headers
Example: here is how to avoid propagating header lines that cause loops and error messages with auto-responders:
header * {
!Registered-Mail-Reply-Requested-By:
!X-Confirm-Reading-To:
}
If list-specific "header" directives are also defined, these are appended to the list inherited by "header *".
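One way to picture how such rules might be applied is the Python sketch below. Several details are assumptions rather than documented behavior: that a header matching no rule is not propagated, that a "!" rule wins over a positive rule, that matching is case-insensitive, and that "all MIME headers" means MIME-Version: plus the Content-* family. The function names are hypothetical.

```python
import re

ALWAYS_KEPT = {"from", "reply-to", "resent-from", "date", "sender", "control",
               "resent-sender", "approved", "message-id", "archive-name",
               "resent-message-id"}

def merged_rules(global_rules, list_rules):
    """List-specific rules are appended to those inherited from "header *"."""
    return list(global_rules) + list(list_rules)

def propagate(header_line, rules):
    """Decide whether one header line is passed on (assumed semantics)."""
    name = header_line.split(":", 1)[0].strip().lower()
    if name in ALWAYS_KEPT or name == "mime-version" or name.startswith("content-"):
        return True  # preserved regardless of the header directive
    keep = False
    for rule in rules:
        negated = rule.startswith("!")
        pattern = rule[1:] if negated else rule
        if re.match(pattern, header_line, re.IGNORECASE):
            if negated:
                return False  # assumed: an exclusion always wins
            keep = True
    return keep
```

With the rule "!X-.+-expire:", a line such as "X-Foo-expire: tomorrow" is dropped, while "From:" survives no matter what the rules say.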
- The system manager is now required to use the system password to issue the following requests:
EDIT, PUT, HOLD, FREE, LOCK, UNLOCK, CONFIGURATION
Moreover, ARCHIVE operations CREATE, MODIFY, REMOVE will work with the manager password only.
Without these restrictions, it is possible for a list owner to fake email from the system manager, and use the list password and get around restrictions posed on owners for the above requests.
- REVIEW now shows the REFLECTOR setting.
- CONFIGURE is a new alias for CONFIGURATION.
- ListProc-generated Subject: lines on outgoing server messages have been limited to 132 characters in length.
- The ILP time-out has been changed from an absolute time-out to an idle time-out. The idle time-out is specified with the -i flag to serverd. For example "serverd -i 180" enables interactive sessions with an idle time-out of 3 minutes. The implementation is as follows:
- A fixed 60 second login time-out is enforced.
- Upon login, the specified idle time-out is enforced.
- Every request (including just hitting return) except TIMELEFT resets the clock.
- When a connection is older than 24 hours it will be preempted if all ports are busy and a new connection request comes in. (Note that this time period is presently hard coded.)
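The first three rules can be modeled with a small Python class. This is a sketch of the described behavior, not the ILP code: the class and method names are hypothetical, time is passed in explicitly to keep the example deterministic, and the 24-hour preemption rule is left out.

```python
LOGIN_TIMEOUT = 60  # seconds; fixed, per the notes above

class Session:
    """Idle time-out bookkeeping for one interactive connection."""

    def __init__(self, idle_timeout, now):
        self.idle_timeout = idle_timeout  # value given with serverd -i
        self.logged_in = False
        self.last_activity = now

    def login(self, now):
        self.logged_in = True
        self.last_activity = now

    def request(self, command, now):
        # Every request, even an empty line, resets the clock; TIMELEFT does not.
        if command.strip().upper() != "TIMELEFT":
            self.last_activity = now

    def time_left(self, now):
        limit = self.idle_timeout if self.logged_in else LOGIN_TIMEOUT
        return max(0, limit - (now - self.last_activity))

    def expired(self, now):
        return self.time_left(now) == 0
```

With "serverd -i 180", a session that last sent a request 100 seconds ago has 80 seconds left, and asking TIMELEFT does not extend it.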
- Catmail now exits with code 75 when files cannot be opened or locked. This forces the MTA (e.g. sendmail) process on the sender's side to queue the message, and try again later. The site manager is sent a notification message when this occurs.
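Exit code 75 is the sysexits.h value EX_TEMPFAIL, which MTAs such as sendmail treat as a temporary failure. The Python sketch below shows the general pattern. It is not catmail's code: the function name is hypothetical, the use of flock() is an assumption about the locking mechanism, and the notification to the site manager is omitted.

```python
import fcntl

EX_TEMPFAIL = 75  # sysexits.h: temporary failure, the MTA should retry later

def append_locked(path, message):
    """Append message to path under an exclusive lock; return an exit code."""
    try:
        f = open(path, "ab")
    except OSError:
        return EX_TEMPFAIL  # file cannot be opened
    with f:
        try:
            fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return EX_TEMPFAIL  # file cannot be locked
        f.write(message)
    return 0
```

A delivery program would pass the returned value to sys.exit(), so that the sending MTA queues the message and tries again later.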
- Subscription managers (as well as list owners) can now EDIT and PUT subscribers and aliases files.
- Digests are now distributed in MIME format. MIME headers for individual messages are also preserved. In previous versions MIME encoded messages could not be read in digest form. These changes solve that problem.
- The "queued" utility has been modified to make it easy to kill. The "start" program has also been changed so that it kills "queued" when appropriate.
- Approval requests to moderators are now sent in MIME format. With previous versions, moderators were forced to pipe MIME messages to munpack or similar utilities in order to read them, with mixed success. The new behavior is as follows:
- If the list is set to MODERATED-NO-EDIT, a MIME encoded approval request is sent even if the message to be considered is not MIME encoded. This is because catmail does not scan the contents of the message.
- If the list is set to MODERATED-EDIT, a MIME approval-request is sent only when the message to be considered is MIME encoded. If the message is not MIME encoded, regular email is sent.
In all cases, approved MIME messages are preserved correctly with the correct MIME headers and boundaries. The only exception is with MODERATED-EDIT when the moderator indents the text with ">" or some other string. Since this breaks the MIME encoding, the moderator should NEVER indent segments of MIME encoded messages.
- Added CCSEARCH as a valid manager preference.
- Fixed various typos in the help files.
- Each moderated-no-edit list now has its own tag counter. In pre-8.0 versions of ListProc, this counter was system-wide. Also, the Subject: header line now shows the tag id when sending request for approval of a message.
- Solaris exhibited a problem with ilp. Because BSD signal handling is used and Solaris is SVR4, signals were sent too fast for the process to catch them again, which resulted in ilp being killed (default behavior for SIGIO, SIGURG, and SIGPOLL). Therefore, signal handling in ilp.c, serverd.c and silp.c has been modified so that on SVR3 and SVR4 systems true System V signal handling is used.
- Alias lookups have been modified. In previous versions, aliases were interpreted just as pairs of strings, separated by whitespace. This caused strange problems if one line had one too many or one too few fields, as all the records after this would be shifted over by one.
We now read pairs of expressions per line and report extra garbage. The "dbglpfiles" utility has been modified to do the same as well.
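The difference between the two approaches can be seen in a short Python sketch of line-by-line parsing. It is illustrative only: the function name and the wording of the reported problems are hypothetical, and the expressions are treated as plain strings.

```python
def parse_aliases(text):
    """Read one pair per line; report malformed lines instead of shifting."""
    pairs, problems = [], []
    for lineno, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # blank line
        if len(fields) < 2:
            problems.append((lineno, "missing field"))
            continue
        pairs.append((fields[0], fields[1]))
        if len(fields) > 2:
            problems.append((lineno, "extra garbage: " + " ".join(fields[2:])))
    return pairs, problems
```

Because each line is parsed on its own, one line with a stray third field no longer shifts every record that follows it.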
- Apostrophes are now allowed in user addresses. The multi-threading code required getting rid of the use of the system() command. As a side effect, ListProc is no longer reliant on the shell's interpretation of command line parameters. Hence, the security issues involved in allowing apostrophes in user names have disappeared.
- A number of changes have been made to the internal handling of the subscribers, aliases, ignored, and owners files in order to improve efficiency.
- The subscribers file is now sorted at most once a day, at least 24 hours after the last sort.
- ListProc now handles unsubscriptions differently. The entry to be removed from the subscribers file is simply overwritten with blanks. (It will be removed at the next sorting.) The code for SUBSCRIBE, UNSUBSCRIBE, REVIEW, STATS, SET, list delivery, and error analysis was changed to make this work.
- The SET request now uses a similar mechanism. The old user entry is overwritten with blanks and a new entry is placed at the end of the file.
- SUBSCRIBE and SET requests will reuse blank entries if the new entry fits in one of them.
- Removal of relevant aliases when a user is unsubscribed is now done using the same "write with blanks" technique which speeds up things a little more. The drawback is that aliases files will now contain blanks in them. Unlike subscribers files, they are not resorted so the blanks will remain until someone removes them by hand.
- SET ... PREF for owners has also been sped up by overwriting entries with blanks and reusing blank entries.
- Removals and additions of owners have been sped up as well by overwriting old entries with blanks, and by using blank entries when there is space for the new entry. Note that the owners file is NOT resorted, so this procedure will create an accumulation of blank lines over time.
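The "write with blanks" technique described in the items above can be sketched in Python as follows. The file layout (one entry per line, address in the first field, file ending in a newline) and the function names are assumptions for illustration; ListProc's actual file formats are not described in these notes.

```python
def blank_out(path, address):
    """Overwrite the entry whose first field is address with spaces, in place."""
    with open(path, "r+b") as f:
        while True:
            start = f.tell()
            line = f.readline()
            if not line:
                return False  # no such entry
            body = line.rstrip(b"\n")
            fields = body.split()
            if fields and fields[0] == address.encode():
                f.seek(start)
                f.write(b" " * len(body))  # same length, so nothing moves
                return True

def add_entry(path, entry):
    """Reuse a blank entry if the new one fits; otherwise append at the end."""
    data = entry.encode()
    with open(path, "r+b") as f:
        while True:
            start = f.tell()
            line = f.readline()
            if not line:
                break
            body = line.rstrip(b"\n")
            if not body.strip() and len(body) >= len(data):
                f.seek(start)
                f.write(data.ljust(len(body)))  # pad to keep the line length
                return
        f.seek(0, 2)
        f.write(data + b"\n")
```

Neither operation rewrites the rest of the file, which is where the speedup comes from; the cost is the blank space that remains until the file is next sorted or cleaned by hand.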
- The following changes have been made regarding aliases:
- UNSUBSCRIBE, SET ... ADDRESS and auto-deletion no longer remove aliases from $LPDIR/.aliases
- SET ... ADDRESS no longer writes the alias to $LPDIR/.aliases. It continues to append aliases to the list's .aliases file.
Hence aliases work as follows:
- A SET ... ADDRESS request will write to the list's .aliases file only (new).
- An ALIAS request will write to the list's aliases file if issued by the owner, and to $LPDIR/.aliases if issued by the manager (unchanged).
- UNSUBSCRIBE, SET ... ADDRESS and auto-deletion will remove relevant aliases from the list's .aliases only (new).
These changes ensure that the manager has absolute control of $LPDIR/.aliases.
- The OS fault tolerance has been improved in the routines that deal with SUBSCRIBE, UNSUBSCRIBE, and SET requests. If an error occurs, mail is sent to the manager (regardless of his/her error preferences) and the user who issued the command is notified. The error messages and their causes are listed below:
subscribe:
ERROR: subscribe(): stat() failed for subscribers file
CAUSE: the subscribers file disappeared before sorting
RESULT: ListProc exits
ERROR: subscribe(): fatal exit code from system() -- see
.warning file
CAUSE: The shell failed to execute the sorting pipe; system ran
out of swap space, temp space, processes or other
resources
RESULT: user was subscribed, but the subscribers file will remain
unsorted
ERROR: subscribe(): stat() failed for temp subscribers file;
list XXX; user YYY; .subscribers file unsorted
CAUSE: the shell failed to execute the sorting pipe; system ran
out of swap space, temp space, processes or other
resources
RESULT: user was subscribed, but the subscribers file will remain
unsorted
ERROR: subscribe(): Internal error: subscriber file sizes
differ; list XXX; user YYY; .subscribers file unsorted.
CAUSE: sort truncated its output; system ran out of swap space,
temp space, processes or other resources
RESULT: user was subscribed, but the subscribers file will remain
unsorted
ERROR: mv() failed
CAUSE: unknown; see .warning; ListProc exits
RESULT: ListProc exits
unsubscribe:
ERROR: unsubscribe(): stat() failed for subscribers file
CAUSE: the subscribers file disappeared before update
RESULT: ListProc exits
ERROR: unsubscribe(): stat() failed for temp subscribers file;
subscribers file not updated; request not completed; list
XXX; user YYY
CAUSE: system ran out of swap space, temp space, processes or
other resources
RESULT: user remains subscribed and is notified
ERROR: unsubscribe(): Internal error: temp subscribers file
truncated; original intact; request not completed; list
XXX; user YYY
CAUSE: system ran out of swap space, temp space, processes or
other resources
RESULT: user remains subscribed and is notified
ERROR: mv() failed
CAUSE: unknown; see .warning; ListProc exits
RESULT: ListProc exits
set:
ERROR: set(): stat() failed for subscribers file
CAUSE: the subscribers file disappeared before update
RESULT: ListProc exits
ERROR: set(): stat() failed for temp subscribers file;
subscribers file not updated; request not completed; list
XXX; user YYY
CAUSE: system ran out of swap space, temp space, processes or
other resources
RESULT: the new setting did not take effect and the user is
notified
ERROR: set(): Internal error: temp subscribers file
truncated; original intact; request not completed; list
XXX; user YYY
CAUSE: system ran out of swap space, temp space, processes or
other resources
RESULT: the new setting did not take effect and the user is
notified
ERROR: mv() failed
CAUSE: unknown; see .warning; ListProc exits
RESULT: ListProc exits
(NOTE: This change was actually in 7.2, but a full description never made it into the 7.2 release notes.)
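The checks behind these messages follow a common pattern: write the new contents to a temp file, verify it, and only then move it over the original. The Python sketch below shows that pattern for the sort case. It is a reconstruction from the error messages above, not ListProc's code; the temp file name is hypothetical, the sort is done in memory rather than through a shell pipe, and it assumes the sort is not expected to drop any entries.

```python
import os

def sort_subscribers(path):
    """Sort a subscribers file via a temp file, keeping the original on failure."""
    tmp = path + ".tmp"  # hypothetical temp file name
    with open(path, "rb") as src:
        lines = src.readlines()
    with open(tmp, "wb") as dst:
        dst.writelines(sorted(lines))
    # A size mismatch means the output was truncated, for example because
    # the system ran out of temp space; leave the original untouched.
    if os.stat(tmp).st_size != os.stat(path).st_size:
        os.remove(tmp)
        raise RuntimeError("subscriber file sizes differ; .subscribers file unsorted")
    os.replace(tmp, path)  # the mv step
```

If any step fails, the original file is still intact, which matches the RESULT lines above in which the user remains subscribed or the file simply stays unsorted.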