Writing this program was great fun.  I thought it would be so 
simple, but there were surprises.
  
  1.  If two threads both call rand() repeatedly, they each see 
exactly the same sequence of values returned.  My solution was to 
have only MAIN call rand();  MAIN then provides a random value to 
each of the threads.

  2.  I found it to be convenient to change the assignment to 
have one thread be MAIN, and to have MAIN be a master controlling 
each of the slave threads.
  
  3.  The assignment was to allow an arbitrary number of threads. 
The assignment was also to allow many different disciplines.  The 
first requirement suggests that all threads will be identical. 
The second requirement implies that each thread will have a 
distinctly assigned behavior.  Not easy to reconcile these two 
requirements.
  
  4.  The accounting for the report was tedious.
  
  5.  When I first ran 1000 iterations, the program completed and 
then immediately the machine shut down due to overheating.  Here 
are the results.  Note that threads which followed a discipline 
of choosing randomly had consistently lower scores than any of 
the other six disciplines.  Why should this be so?  If I were 
inclined to look for further bugs (which I am not), this is the 
place I would start.

    Results   

    Thread     Discipline         Total   

         0     ALWAYS_ROCK         3949  
         1     ALWAYS_RANDOM       3637  
         2     ALWAYS_PAPER        4053  
         3     ALWAYS_RANDOM       3727  
         4     ALWAYS_SCISSORS     3997  
         5     ALWAYS_RANDOM       3679  
         6     ALWAYS_MOST         4011  
         7     ALWAYS_RANDOM       3640  
         8     ALWAYS_MIDDLE       4052  
         9     ALWAYS_RANDOM       3667  
        10     ALWAYS_LEAST        3938  
        11     ALWAYS_RANDOM       3612  

  6.  This program uses only spin locks to communicate between 
threads.  How does this work?

  Consider threads t0 and t1 which progress in lockstep via the 
following code:

        t0                 t1
   x = 0;
   while (1);           while (1);
   {                    {
     while (x == 0){};    while (x == 1){};
     other0();            other1();
     x = 0;               x = 1;
   }                    }

  X is written by one thread and then is written by the other 
thread only when it has seen the write from the first thread. 

  7.  The actual communication mechanism between MAIN and each 
thread.

  There is a field, called THROW, for each thread.  MAIN 
initializes THROW to zero (for each thread, but ignore that now).
MAIN then waits for THROW to become nonzero.  

  Each thread is initialized to expect THROW to be zero.  When it 
sees that THROW is zero, the thread puts ROCK (=1), PAPER (=2), 
or SCISSORS (=3) in THROW.  Then the thread waits for THROW to 
become equal either to zero or to END (=4).

  MAIN sees THROW is nonzero, and so records statistics on the 
thread's throw.  If all iterations have completed, MAIN puts END 
in THROW.  Otherwise it puts zero in THROW and waits for ROCK, 
PAPER, or SCISSORS from the thread.  

  When a thread sees THROW set to END, it sets THROW to ENDED 
and then terminates.

  When MAIN sees ENDED from all threads, it prints the results of 
the run and then terminates.

  8.  Here is the logic in more detail.

    The field s[i][THROW] plays the role of X above.  MAIN sets
  s[i][THROW] to NONE to tell thred to THROW (or MAIN sets it to
  END to tell thred to terminate).  When thred sees NONE, it
  throws ROCK, PAPER, or SCISSORS.  When it sees END, it terminates.

    Here is a very simplified interaction.  If you understand this,
  you will understand the program.  If not, not.

    MAIN               
    THROW = NONE   // Initial value for each thread.
    Create threads.
    Begin loop of N iterations.
      For each thread spin while THROW == NONE.
      // at this point all threads have thrown ROCK, PAPER, or SCISSORS.
      Accumulate results of the iteration.
      Set THROW = NONE for each thread to allow each thread to run again.
  End loop of N iterations.
    For each thread set THROW = END.
    For each thread spin until THROW = ENDED.
    Print out accumulated statistics.
    return from MAIN.

                       THRED
               while (1)                    // Loop until
                           if THROW == END break      // MAIN says to quit, or
                           if THROW != NONE continue  // wait for MAIN to say to throw.
                           // Signal MAIN to stop spinning and to read my THROW.
                           Set THROW = ROCK, PAPER, or SCISSORS by whatever algorithm.
                         end while
                         // Come here only after MAIN said END.
                         Set THROW = ENDED  // Tell MAIN I am ending.
                         return from THRED.
     
 
  9.  Why is this method of interaction safe?
  
  Everyone knows that machines are not sequentially consistent 
(SC).  In order for threads to safely access shared operands, 
threads must first set a lock and then, when the access is 
completed, unlock the lock.

  On almost all of today's machines failures to be SC take only 
two forms:  reads occur before writes, or writes occur 
nonatomically.

  A younger read operation can occur before an older write 
operation only if the two operations access different operands. 
Since MAIN and each thread access only one operand between them,
there can be no reordering of operations on the shared operand.

  Each shared operand is written by one thread and read by one 
other.  Hence there can never be the situation where two threads 
read an operand and one sees a new value and the other sees an 
old value, a necessary condition for a failure of multiple copy 
write atomicity to occur.
  
  10.  But . . .?
  
  Having said that, I admit that I will be curious to see the 
results of people possibly running this program on other 
machines.  If anyone sees a hang, please let me know.

  11.  Surprise bug.  I ran this program a bunch of times during 
the debugging process.  There were formating errors, but (after a 
certain point) no timing errors.  Then, when I was convinced that 
everything was complete, a hang happened.  I am not certain of 
the cause.  At one time I thought that the problem lay in the two 
fetches of s[i][THROW] at the beginning of the loop in THRED 
having occurred out of order.  However, after several hours of 
study over several days, I cannot believe that that happened, so 
the cause of the hang remains a mystery.  

  12.  How to run.
  
  At the beginning of RPS.CPP set the DEFINEd variables.
   
  To compile on Windows:
  
#define  SYSTEMID_WINXP_VC2008

  To compile on Linus:
  
#define  SYSTEMID_LINUX

  To make one fetch in THRED:
  
#define  FETCH 1

  To make two fetches in THRED:
  
#define  FETCH 2


    Bill Collier  collier@acm.org