Re: GECOS field parsing Lassi Kortela 17 Aug 2019 09:55 UTC

Check out the Solaris finger client:
<https://github.com/kofemann/opensolaris/blob/80192cd83bf665e708269dae856f9145f7190f74/usr/src/cmd/cmd-inet/usr.bin/finger.c>.
I hereby reproduce the relevant code comment in its full glory. It has
to be seen to be believed. Someone, somewhere earned their job.

----------------------------------------------------------------------

/*
  * The grammar of the pw_gecos field is sufficiently complex that the
  * best way to parse it is by using an explicit finite-state machine,
  * in which a table defines the rules of interpretation.
  *
  * Some special rules are necessary to handle the fact that names
  * may contain certain punctuation characters.  At this writing,
  * the possible punctuation characters are '.', '-', and '_'.
  *
  * Other rules are needed to account for characters that require special
  * processing when they appear in the pw_gecos field.  At present, there
  * are three such characters, with these default values and effects:
  *
  *    gecos_ignore_c   '*'    This character is ignored.
  *    gecos_sep_c      ','    Delimits displayed and nondisplayed contents.
  *    gecos_samename   '&'    Copies the login name into the output.
  *
  * As the program examines each successive character in the returned
  * pw_gecos value, it fetches (from the table) the FSM rule applicable
  * for that character in the current machine state, and thus determines
  * the next state.
  *
  * The possible states are:
  *    S0 start
  *    S1 in a word
  *    S2 not in a word
  *    S3 copy login name into output
  *    S4 end of GECOS field
  *
  * Here follows a depiction of the state transitions.
  *
  *
  *              gecos_ignore_c OR isspace OR any other character
  *                  +--+
  *                  |  |
  *                  |  V
  *                 +-----+
  *    NULL OR      | S0  |  isalpha OR isdigit
  * +---------------|start|------------------------+
  * |  gecos_sep_c  +-----+                        |     isalpha OR isdigit
  * |                |  |                          |   +---------------------+
  * |                |  |                          |   | OR '.' '-' '_'      |
  * |                |  |isspace                   |   |                     |
  * |                |  +-------+                  V   V                     |
  * |                |          |              +-----------+                 |
  * |                |          |              |    S1     |<--+             |
  * |                |          |              | in a word |   | isalpha OR  |
  * |                |          |              +-----------+   | isdigit OR  |
  * |                |          |               |  |  |  |     | '.' '-' '_' |
  * |                |    +----- ---------------+  |  |  +-----+             |
  * |                |    |     |                  |  |                      |
  * |                |    |     |   gecos_ignore_c |  |                      |
  * |                |    |     |   isspace        |  |                      |
  * |                |    |     |   ispunct/other  |  |                      |
  * |                |    |     |   any other char |  |                      |
  * |                |    |     |  +---------------+  |                      |
  * |                |    |     |  |                  |NULL OR gecos_sep_c   |
  * |                |    |     |  |                  +------------------+   |
  * |  gecos_samename|    |     V  V                                     |   |
  * |  +-------------+    |    +---------------+                         |   |
  * |  |                  |    |       S2      | isspace OR '.' '-' '_'  |   |
  * |  |  gecos_samename  |    | not in a word |<---------------------+  |   |
  * |  |  +---------------+    +---------------+ OR gecos_ignore_c    |  |   |
  * |  |  |                        |    ^  |  |  OR ispunct OR other  |  |   |
  * |  |  |                        |    |  |  |                       |  |   |
  * |  |  |  gecos_samename        |    |  |  +-----------------------+  |   |
  * |  |  |  +---------------------+    |  |                             |   |
  * |  |  |  |                          |  |                             |   |
  * |  |  |  |            gecos_ignore_c|  | NULL OR gecos_sep_c         |   |
  * |  |  |  |            gecos_samename|  +-----------------------+     |   |
  * |  |  |  |            ispunct/other |                          |     |   |
  * |  V  V  V            isspace       |                          |     |   |
  * | +-----------------+ any other char|                          |     |   |
  * | |      S3         |---------------+  isalpha OR isdigit OR   |     |   |
  * | |insert login name|------------------------------------------ ----- ---+
  * | +-----------------+                  '.' '-' '_'             |     |
  * |                |    NULL OR gecos_sep_c                      |     |
  * |                +------------------------------------------+  |     |
  * |                                                           |  |     |
  * |                                                           V  V     V
  * |                                                         +------------+
  * | NULL OR gecos_sep_c                                     |     S4     |
  * +-------------------------------------------------------->|end of gecos|<--+
  *                                                           +------------+   |
  *                                                                      | all |
  *                                                                      +-----+
  *
  *
  *  The transitions from the above diagram are summarized in
  *  the following table of target states, which is implemented
  *  in code as the gecos_fsm array.
  *
  * Input:
  *        +--gecos_ignore_c
  *        |    +--gecos_sep_c
  *        |    |    +--gecos_samename
  *        |    |    |    +--isalpha
  *        |    |    |    |    +--isdigit
  *        |    |    |    |    |      +--isspace
  *        |    |    |    |    |      |    +--punctuation possible in name
  *        |    |    |    |    |      |    |    +--other punctuation
  *        |    |    |    |    |      |    |    |    +--NULL character
  *        |    |    |    |    |      |    |    |    |    +--any other character
  *        |    |    |    |    |      |    |    |    |    |
  *        V    V    V    V    V      V    V    V    V    V
  * From: ---------------------------------------------------
  * S0   | S0 | S4 | S3 | S1 | S1 |   S0 | S1 | S2 | S4 | S0 |
  * S1   | S2 | S4 | S3 | S1 | S1 |   S2 | S1 | S2 | S4 | S2 |
  * S2   | S2 | S4 | S3 | S1 | S1 |   S2 | S2 | S2 | S4 | S2 |
  * S3   | S2 | S4 | S2 | S1 | S1 |   S2 | S1 | S2 | S4 | S2 |
  * S4   | S4 | S4 | S4 | S4 | S4 |   S4 | S4 | S4 | S4 | S4 |
  *
  */
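
For anyone who wants to poke at the table without reading finger.c, here is a minimal C sketch of a table-driven walk over a GECOS string. Only the array name gecos_fsm, the three default special characters and the target states are taken from the comment above. The enum names, classify() and gecos_display_name() are hypothetical names of mine, and the output actions (copy ordinary characters, skip the ignore character, paste the login name on entering S3, stop on entering S4) are my assumption about what the states are for, not the Solaris code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum state { S0, S1, S2, S3, S4, NSTATES };

/* Input classes, in the same column order as the table in the comment. */
enum input { IN_IGNORE, IN_SEP, IN_SAMENAME, IN_ALPHA, IN_DIGIT, IN_SPACE,
             IN_NAMEPUNCT, IN_PUNCT, IN_NUL, IN_OTHER, NINPUTS };

/* Target states, transcribed row by row from the quoted table. */
static const enum state gecos_fsm[NSTATES][NINPUTS] = {
    /* S0 */ { S0, S4, S3, S1, S1, S0, S1, S2, S4, S0 },
    /* S1 */ { S2, S4, S3, S1, S1, S2, S1, S2, S4, S2 },
    /* S2 */ { S2, S4, S3, S1, S1, S2, S2, S2, S4, S2 },
    /* S3 */ { S2, S4, S2, S1, S1, S2, S1, S2, S4, S2 },
    /* S4 */ { S4, S4, S4, S4, S4, S4, S4, S4, S4, S4 },
};

/* Map one character to its input class, using the default specials. */
static enum input classify(unsigned char c)
{
    if (c == '\0') return IN_NUL;
    if (c == '*')  return IN_IGNORE;    /* gecos_ignore_c */
    if (c == ',')  return IN_SEP;       /* gecos_sep_c */
    if (c == '&')  return IN_SAMENAME;  /* gecos_samename */
    if (isalpha(c)) return IN_ALPHA;
    if (isdigit(c)) return IN_DIGIT;
    if (isspace(c)) return IN_SPACE;
    if (c == '.' || c == '-' || c == '_') return IN_NAMEPUNCT;
    if (ispunct(c)) return IN_PUNCT;
    return IN_OTHER;
}

/*
 * Hypothetical helper: write the displayed part of gecos into out,
 * truncating to outsz - 1 characters, and return the length written.
 * The output actions are assumptions; see the text above.
 */
static size_t gecos_display_name(const char *gecos, const char *login,
                                 char *out, size_t outsz)
{
    enum state st = S0;
    size_t n = 0;

    assert(outsz > 0);
    for (;; gecos++) {
        enum input in = classify((unsigned char)*gecos);

        st = gecos_fsm[st][in];
        if (st == S4)
            break;                      /* NUL or separator: done */
        if (st == S3) {
            /* Entering S3: copy the login name into the output. */
            size_t len = strlen(login);
            for (size_t i = 0; i < len && n + 1 < outsz; i++)
                out[n++] = login[i];
        } else if (in != IN_IGNORE && in != IN_SAMENAME && n + 1 < outsz) {
            out[n++] = *gecos;          /* ordinary character */
        }
    }
    out[n] = '\0';
    return n;
}
```

With these assumptions, the GECOS string "John & Smith,Room 1,555-1234" and the login "jdoe" yield "John jdoe Smith". Everything after the first comma is never looked at, because every row of the table sends the separator to S4.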