







                          A First Look at KSOS-32 Performance

                                      Tom Perrine
                                        Logicon
                               Operating Systems Division
                                      San Diego CA

                                       May 3 1987





               _1.  _I_n_t_r_o_d_u_c_t_i_o_n

               This report will discuss the performance of KSOS-32,  com-
               pared  with  4.2BSD  UNIX  running  on  the same hardware.
               Specifically, differences  in  CPU-bound  process  perfor-
               mance, file system operations, and process creation speeds
               will be analyzed and explained.  Suggestions  for  perfor-
               mance  enhancements  will be presented with approximations
               of expected performance gains.


               _2.  _B_a_c_k_g_r_o_u_n_d

               As of mid-April, 1988, the  KSOS-32  Security  Kernel  was
               fully  functional  on  VAX-11/780  hardware. As this was a
               significant milestone in the KSOS-32 project, a review was
               deemed necessary to adequately assess the current state of
               the Kernel and to suggest where further development effort
               (not currently funded) could produce the best results. The
               primary goal of further KSOS-32 Kernel  development  would
               be  increased  performance  of  the  base system with some
               functional enhancements  for  improved  security,  notably
               Access  Control Lists.  This report is concerned only with
               the performance issues.

               KSOS-32 is essentially a port of KSOS-11 to VAX  hardware.
               KSOS-11  currently runs on DEC PDP-11/70 systems. The per-
               formance of KSOS-11 is described in [Perr84].

               The current state of KSOS-32 is very similar to  the  very
               first release of Berkeley Software Distribution (BSD) UNIX
               for the VAX (3 BSD), especially in  the  following  areas.
               KSOS-32 does not yet take full advantage of the virtual
               memory architecture of the VAX hardware; for example, it
               does not page. KSOS-32 still has a file system based on
               Version 6 UNIX, while BSD UNIX was based on a Version 7
               UNIX file system until recently (4.1c or 4.2 BSD in 1981).
               The Version 6 and Version 7 UNIX file systems are almost
               identical in structure and performance.

               This similarity between 3 BSD UNIX and KSOS-32 will be
               explored further in the Summary section. In this report,
               whenever the term UNIX is used alone, it refers to version
               4.2 BSD UNIX, as distributed by Berkeley.


               _3.  _B_e_n_c_h_m_a_r_k _p_r_o_g_r_a_m_s _a_n_d _e_n_v_i_r_o_n_m_e_n_t

               It has often been said that there are "lies,  damned  lies
               and  benchmarks." With this in mind, the following results
               are intended to guide further KSOS-32 Kernel  development,
               not as an absolute measure of KSOS-32 performance.


               _3._1.  _T_h_e _P_r_o_g_r_a_m_s

               In order to measure the relative performance of KSOS-32 on
               the VAX hardware, the following programs were written:


               *    prime - calculate prime numbers (CPU-bound);

               *    openit - file create/close;

               *    rw - write to a file and then read it;

               *    copy - copy a file;

               *    forkit - create several subprocesses.

               These programs were run under KSOS-32 and  4.2  BSD  UNIX.
               The results are summarized in Table 1.

               All of the measurements are considered to be accurate to
               the nearest second, as this is the resolution of the
               common system clock under BSD UNIX. (The gettimeofday
               system call is not portable to System V, where further
               performance comparisons will be performed later. The KSOS
               clock presented timings to the nearest 0.10 second, which
               were rounded.) The clock resolution is small compared
               with the performance differences being measured and is
               not considered to be an issue at this time.

               The source code of the benchmark programs is  included  in
               the Appendix.



9

9









                                                                        3



                         ________________________________________________

                         Program      KSOS-32       4.2 BSD       Ratio
                                    (seconds)     (seconds) (KSOS:UNIX)
                         -------    ---------     --------- -----------
                         prime            178           178         1:1
                         openit           352             8        44:1
                         rw                54             3        18:1
                         copy              57             3        19:1
                         forkit           832            10        83:1


                              Table 1. Performance Results
                         ________________________________________________
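
               The ratios in Table 1 follow directly from the elapsed
               times. As a quick check, the short Python sketch below
               transcribes the table and recomputes each ratio, rounded
               to the nearest integer. The names TIMES, ratio and RATIOS
               are illustrative only and do not come from the benchmark
               sources.

```python
# Elapsed times in seconds, transcribed from Table 1: (KSOS-32, 4.2 BSD).
TIMES = {
    "prime": (178, 178),
    "openit": (352, 8),
    "rw": (54, 3),
    "copy": (57, 3),
    "forkit": (832, 10),
}


def ratio(ksos_seconds, unix_seconds):
    """Return the KSOS-32 : UNIX slowdown, rounded to the nearest integer."""
    return round(ksos_seconds / unix_seconds)


RATIOS = {name: ratio(k, u) for name, (k, u) in TIMES.items()}

for name, r in RATIOS.items():
    print(f"{name:8s} {r}:1")
```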


               _3._2.  _T_h_e _E_n_v_i_r_o_n_m_e_n_t

               All of the benchmark programs were run  on  the  same  DEC
               VAX-11/780  system. The system is configured with 6 Mbytes
               of main memory and two MASSBUS RM05 removable disk drives.
               One  of the drives always has the UNIX system disk and the
               other always has the KSOS system  disk.  The  KSOS  system
               pack was constructed from a PDP-11 KSOS pack by using
               several specially-written conversion  programs.  The  UNIX
               file system was approximately 87% full and the file system
               on the KSOS pack was about 70% full.

               The benchmarks were written in Modula-2 because  there  is
               no  KSOS  Security  Kernel  system  call interface for any
               other language at this time. The Modula-2 compiler  builds
               object  files  that  are  compatible with 4.2 BSD UNIX and
               KSOS-32 (which uses the same object module format).
               However, because System V UNIX cannot run these object
               programs, the benchmarks have not yet been run on System V
               UNIX.


               _4.  _A_n_a_l_y_s_i_s

               The performance of each of the systems as measured by  the
               benchmark programs will be discussed in this section. Pre-
               viously planned and newly suggested enhancements to  KSOS-
               32  that  address each performance area will be referenced
               and explained.




9

9









                                                                        4


               _4._1.  _P_r_i_m_e

               This program uses a fairly inefficient method of
               repeatedly finding prime numbers using the Sieve of
               Eratosthenes. It is completely CPU-bound, doing no I/O
               during its main processing loop.
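
               The actual benchmark source is in the Appendix and was
               written in Modula-2. Purely as an illustration of the kind
               of CPU-bound loop described here, the following Python
               sketch repeats a Sieve of Eratosthenes with no I/O inside
               the loop. The function names, the limit and the repetition
               count are assumptions, not values taken from the
               benchmark.

```python
def sieve(limit):
    """Return all primes <= limit using the Sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    n = 2
    while n * n <= limit:
        if is_prime[n]:
            # Cross out every multiple of n, starting at n squared.
            for multiple in range(n * n, limit + 1, n):
                is_prime[multiple] = False
        n += 1
    return [i for i, flag in enumerate(is_prime) if flag]


def cpu_bound_run(limit, repetitions):
    """Repeat the sieve with no I/O in the loop; return the last prime count."""
    count = 0
    for _ in range(repetitions):
        count = len(sieve(limit))
    return count
```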

               The results show that for CPU-bound programs, KSOS-32
               offers performance that is equal to UNIX. This is an
               important result for two reasons. First, it allows a good
               prediction of the performance of a CPU-bound task under
               KSOS-32 on any VAX processor that runs UNIX. Second, it
               shows that the security mechanisms of KSOS impose no
               additional overhead on a CPU-bound program. A task under
               KSOS-32 will perform nearly identically to the same task
               under UNIX on the same VAX processor model.

               This allows the general statement that  a  CPU-bound  task
               under  KSOS-32 on a VAX will perform faster than any other
               hardware and software  security-kernel  based  system,  if
               UNIX on the same VAX is faster than that system.

               Any statement that can be made about a  CPU-bound  program
               should  be  applicable to the CPU-bound phases of any pro-
               gram.

               The current KSOS-32 process scheduler is designed to  give
               good  response  time  to interactive processes in a "fair"
               manner. It is very similar in intent and behavior  to  the
               UNIX  process scheduler. Other scheduler designs have been
               proposed for KSOS-32 that would allow  better  support  to
               "real-time" processes that require consistent, predictable
               response to interrupts and other events. Such a  scheduler
               could  be  easily added to KSOS-32. This is in contrast to
               UNIX, which would require considerable work.


               _4._2.  _O_p_e_n_i_t

               This program repeatedly creates and  closes  a  file.  The
               create  operation  implicitly  opens the file for writing,
               but no file I/O is performed.
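
               A create/close loop of this kind can be sketched in Python
               as shown below. This is not the benchmark itself: the file
               name, the iteration count and the decision to remove the
               file on each pass (so that every pass is a genuine
               creation) are assumptions made for illustration, and the
               removal is included in the measured time.

```python
import os
import tempfile
import time


def openit(directory, iterations):
    """Create, close and remove a file `iterations` times.

    Returns the elapsed wall-clock time in seconds. No data is written.
    """
    path = os.path.join(directory, "openit.tmp")  # hypothetical file name
    start = time.monotonic()
    for _ in range(iterations):
        # O_CREAT | O_EXCL forces a genuine creation on every pass; the
        # file is implicitly opened for writing, as the text describes.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        os.close(fd)
        os.unlink(path)  # assumption: remove it so the next pass creates anew
    return time.monotonic() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(f"{openit(scratch, 1000):.3f} seconds")
```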

               Table 1 shows that KSOS-32 is much slower than UNIX for
               file creations. Examination of KSOS-32 and UNIX source
               code reveals that the close operation is very fast for
               both systems and is certainly negligible compared with the
               create operation. Therefore this program is really a
               measure of the file creation operation, which for UNIX
               involves finding an available "inode" and for KSOS-32
               involves finding an available "jnode". (These structures
               serve identical purposes and are very similar in
               structure.)

               This large performance difference is not unexpected as the
               4.2  BSD UNIX file system is the Berkeley Fast File System
               (FFS), while the KSOS-32 file system is  very  similar  in
               disk  layout  and  performance  to the Version 7 UNIX file
               system.  The  performance  of  the  FFS  is  discussed  in
               [McKu83]. Due to the organization of the on-disk inode and
               free space areas in the FFS, file creations are very fast.

               Further examination of the KSOS-32 activity in this area
               would show that KSOS-32 is searching for a free "jnode".
               The search scans an on-disk structure, which requires I/O
               (to read the disk) and takes time linear in the number of
               entries examined, i.e. O(n).
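
               The cost of such a linear search can be illustrated with a
               small simulation. In the Python sketch below, a list of
               booleans stands in for the on-disk index area and each
               block examined is counted as one disk read. The value of
               JNODES_PER_BLOCK and the function name are illustrative
               assumptions, not details of the KSOS-32 implementation.

```python
JNODES_PER_BLOCK = 16  # illustrative value, not taken from KSOS-32


def find_free_jnode(in_use):
    """Linear scan of an index area for the first free entry.

    `in_use` holds one boolean per jnode and stands in for the on-disk
    index area. Returns (index, blocks_read); index is None when no
    entry is free.
    """
    blocks_read = 0
    for start in range(0, len(in_use), JNODES_PER_BLOCK):
        blocks_read += 1  # each block examined costs one disk read
        block = in_use[start:start + JNODES_PER_BLOCK]
        for offset, used in enumerate(block):
            if not used:
                return start + offset, blocks_read
    return None, blocks_read
```

               The fuller the index area, the more blocks must be read
               before a free entry turns up, which is the behavior the
               openit benchmark exposes.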

               Comparing KSOS-32 to a version of  System  V  UNIX  or  an
               older version of BSD UNIX (pre 4.2) which does not use the
               FFS would be more appropriate, as these  systems  used  an
               on-disk organization which is similar to KSOS-32. However,
               the benchmark programs would not run on System V UNIX  and
               older BSD systems are extremely hard to find (because they
               performed very similarly to KSOS-32!).

               Suggested  file  system  enhancements  will  be  discussed
               together in the "Recommendations" section below.


               _4._3.  _R_W

               The "RW" program creates a file, writes a large number of
               blocks to the file and reads them back. Only the writing
               and reading are timed. This benchmark gives an indication
               of the performance of the disk block allocation algorithm,
               as well as device I/O performance.
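
               A write-then-read measurement of this shape might look
               like the following Python sketch. The block size, block
               count and file name are assumptions chosen for
               illustration; the file is created before the timers start,
               so only the writing and reading are timed.

```python
import os
import tempfile
import time

BLOCK_SIZE = 512  # assumed block size, for illustration only


def rw(path, block_count):
    """Write `block_count` blocks to `path`, then read them back.

    Returns (write_seconds, read_seconds, bytes_read). File creation
    happens before either timer starts.
    """
    block = b"\0" * BLOCK_SIZE
    fd = os.open(path, os.O_CREAT | os.O_TRUNC | os.O_RDWR, 0o644)
    try:
        start = time.monotonic()
        for _ in range(block_count):
            os.write(fd, block)
        write_seconds = time.monotonic() - start

        os.lseek(fd, 0, os.SEEK_SET)  # rewind before the read phase
        bytes_read = 0
        start = time.monotonic()
        while True:
            data = os.read(fd, BLOCK_SIZE)
            if not data:
                break
            bytes_read += len(data)
        read_seconds = time.monotonic() - start
    finally:
        os.close(fd)
    return write_seconds, read_seconds, bytes_read


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(rw(os.path.join(scratch, "rw.tmp"), 2048))
```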

               Table 1 shows that KSOS-32 is about an order of magnitude
               slower than UNIX for writing and reading. This is most
               likely due to the speed of the KSOS-32 device drivers, the
               fact that the Kernel is doing no buffering and the slower
               search for available disk blocks. The search for available
               disk blocks requires I/O to search the on-disk bit map,
               which further slows the system due to disk arm movement,
               exacerbated by the existing disk device drivers. This
               search is faster than the Version 6 UNIX file system free
               block search, but slower than the FFS search.

               Possible enhancements  to  the  KSOS-32  file  system  are
               described  below.   Implementation  of these would improve
               this benchmark to roughly the same as UNIX performance.














               _4._4.  _C_o_p_y

               The "Copy" benchmark is very similar to the "RW" benchmark
               except that the source for the blocks is a file. This
               program copies the large file that was created in the "RW"
               benchmark to another file.

               The relative performance figures show a very small
               degradation from the "RW" benchmark. This is mostly due to
               the additional disk arm movements required as a block is
               alternately read from one file and then written to another
               file.



               _4._5.  _F_o_r_k_i_t

               This benchmark program tests process creation performance.
               In BSD UNIX, process creations are implemented via a
               "fork," wherein the parent process is "cloned." This
               cloning includes completely duplicating the virtual memory
               space of the parent, as well as the state of any open
               files, etc. At this point the child and parent processes
               continue to execute, but typically take different paths
               through the program logic. Often the child will
               immediately "exec" to load a new program image, discarding
               all of the context that was copied from the parent
               process. KSOS-32 provides the "fork" and "exec"
               functionality, but also provides a single-step "spawn"
               which does not waste time copying the parent's virtual
               address space, but immediately loads a new program image.
               This feature was not utilized by this benchmark program.

               For the "fork" operation, KSOS-32 is much slower than  BSD
               UNIX.   Again  this  is  no  surprise.  This is due to BSD
               UNIX's much better use of the VAX virtual  memory  facili-
               ties  to  support  the fork operation.  Specifically, UNIX
               initially duplicates only the page  table  of  the  parent
               process  and uses a "copy-on-write" algorithm. KSOS-32, on
               the other hand, copies all of the contents of  the  memory
               segments at "fork" time.

               "Copy-on-write" is implemented by UNIX  in  the  following
               manner. At fork time, the page table of the parent is
               copied into the page table of the child. Then, in the
               child's page table, all of the writable pages are marked
               as "copy-on-write", meaning that until the child tries to
               write to a page, it will be shared with the parent. When
               the child process does try to write to the page, the page
               is copied into a new page allocated for the child and the
               child's page table is modified to point to the new
               private page instead of the shared page. This can cut the
               time required to duplicate the virtual address space
               during the fork by a tremendous amount.
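
               The mechanism can be modeled in a few lines. The Python
               sketch below is a toy simulation, not kernel code: the
               Frame and Process classes, the reference count and the
               pages_copied counter are all assumptions introduced for
               illustration. It also marks the parent's entries as
               copy-on-write so that a later write by the parent does not
               become visible to the child, a detail the description
               above does not spell out.

```python
class Frame:
    """A physical page: its contents plus a count of page tables using it."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.refs = 1


class Process:
    def __init__(self, pages):
        # Page table: virtual page number -> [frame, copy_on_write flag].
        self.table = {n: [Frame(d), False] for n, d in enumerate(pages)}
        self.pages_copied = 0

    def fork(self):
        """Duplicate only the page table; every frame is shared at first."""
        child = Process([])
        for vpn, entry in self.table.items():
            entry[0].refs += 1
            entry[1] = True                   # parent side (assumption)
            child.table[vpn] = [entry[0], True]
        return child

    def write(self, vpn, offset, value):
        entry = self.table[vpn]
        frame, cow = entry
        if cow and frame.refs > 1:
            # First write to a shared page: copy it into a private frame.
            frame.refs -= 1
            frame = Frame(frame.data)
            entry[0] = frame
            self.pages_copied += 1
        entry[1] = False                      # page is now private
        frame.data[offset] = value

    def read(self, vpn, offset):
        return self.table[vpn][0].data[offset]
```

               In this model a fork copies no page contents at all; a
               page is copied only when one side first writes to it, so a
               child that immediately loads a new program image pays for
               almost none of the parent's address space.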

               Specific recommendations for speeding up the fork
               operation will be discussed in the next section.

               5.  Recommendations

               This section describes some of the performance
               enhancements which should be implemented in KSOS-32.


               _5._1.  _F_i_l_e _S_y_s_t_e_m _E_n_h_a_n_c_e_m_e_n_t_s

               Some of the planned enhancements to the KSOS-32 file
               system are briefly mentioned in [KSOS85]. These and some
               new ideas will be discussed here.

               The desirable file system enhancements  include  improving
               the device drivers for the disk devices, data buffering in
               the Kernel and changing the organization  of  the  on-disk
               structures.

               The current disk device drivers handle all I/O requests on
               a first-in-first-out basis. This could easily be replaced
               by any of the request-ordering algorithms that have been
               described over the last few years, including seek
               ordering, rotational latency calculations and block
               pre-fetching. In addition, KSOS-32 currently allows only
               one I/O request to be pending for any given device. This
               is entirely a holdover from the PDP-11 version of the
               system, which encountered memory space limitations that
               prohibited the addition of any new code to the Kernel.
               There is no such limitation in KSOS-32 and changing this
               would give a large performance boost.
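
               To illustrate why request ordering matters, the Python
               sketch below compares total disk arm travel for a queue of
               pending requests served first-in-first-out against the
               same queue served in a single up-then-down sweep (a simple
               form of seek ordering). The cylinder numbers, the starting
               position and the function names are made up for the
               example. Note that any such reordering presupposes that
               more than one request can be pending per device.

```python
def arm_travel(start, cylinders):
    """Total cylinders crossed when requests are served in the given order."""
    travel, position = 0, start
    for cylinder in cylinders:
        travel += abs(cylinder - position)
        position = cylinder
    return travel


def elevator_order(start, cylinders):
    """One upward sweep from `start`, then one downward sweep."""
    upward = sorted(c for c in cylinders if c >= start)
    downward = sorted((c for c in cylinders if c < start), reverse=True)
    return upward + downward


REQUESTS = [98, 183, 37, 122, 14, 124, 65, 67]  # illustrative queue
START = 53                                      # illustrative arm position

FIFO_TRAVEL = arm_travel(START, REQUESTS)
ELEVATOR_TRAVEL = arm_travel(START, elevator_order(START, REQUESTS))
print(FIFO_TRAVEL, ELEVATOR_TRAVEL)
```

               For this particular queue the first-in-first-out order
               moves the arm across 640 cylinders, while the sweep order
               needs 299.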

               The Kernel as currently implemented does no data
               buffering. All data is transferred to and from buffers in
               the user's virtual address space. There is no caching of
               user data blocks. This means that several requests for the
               same data block, even by the same user, require several
               physical I/O operations. This limitation stems from the
               original philosophy of the KSOS-11 design, wherein the
               UNIX Emulator (which no longer exists) would buffer user
               data. Having the Kernel buffer data blocks for both
               reading and writing would also give a large performance
               boost.
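
               The effect of a kernel block cache on the number of
               physical reads can be seen in a small model. The Python
               sketch below uses a least-recently-used replacement
               policy; the BufferCache class, its capacity and the
               replacement policy are assumptions chosen for
               illustration, not a proposed KSOS-32 design.

```python
from collections import OrderedDict


class BufferCache:
    """A tiny least-recently-used cache of disk blocks."""

    def __init__(self, read_block, capacity):
        self.read_block = read_block   # function doing the physical read
        self.capacity = capacity
        self.buffers = OrderedDict()   # block number -> data
        self.physical_reads = 0

    def read(self, block_number):
        if block_number in self.buffers:
            # Cache hit: no physical I/O, just mark as recently used.
            self.buffers.move_to_end(block_number)
            return self.buffers[block_number]
        self.physical_reads += 1
        data = self.read_block(block_number)
        self.buffers[block_number] = data
        if len(self.buffers) > self.capacity:
            self.buffers.popitem(last=False)  # evict least recently used
        return data
```

               Without the cache every request would be a physical read;
               with it, repeated requests for a block that is still
               buffered cost no disk I/O at all.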

9

9









                                                                        8


               The on-disk structures of KSOS-32 are exactly those of
               KSOS-11. The KSOS-11 file system is very similar to the
               Version 6 UNIX file system, although much more robust.
               However, there is still the notion of an index area and a
               data block area. Because reading a file requires accessing
               both the index area and the data block area, there is a
               considerable amount of disk arm movement. Given the design
               of the existing I/O system, this is very slow. This is
               almost exactly the situation that led Berkeley to develop
               the Fast File System (FFS). KSOS-32 could essentially
               re-implement the FFS and gain the dramatic improvements
               that this file system brought to 4.2 BSD UNIX. If an
               FFS-style file system were implemented on KSOS-32, the
               only difference would be a small amount of overhead (much
               less than 5%) at file open or creation time due to a small
               amount of security checking.


               _5._2.  _V_i_r_t_u_a_l _M_e_m_o_r_y _E_n_h_a_n_c_e_m_e_n_t_s

               KSOS-32 could and should use the "copy-on-write" approach
               for copying virtual address spaces during the fork
               operation. This would also speed up the combined "spawn"
               operation. This change would provide a dramatic
               performance increase. However, KSOS-32 process creations
               will always be at least slightly slower because KSOS-32
               processes are slightly more complex and contain more
               context which must be duplicated. At best, this could lead
               to a less than 10% difference in performance, compared
               with BSD UNIX.



               _6.  _S_u_m_m_a_r_y

               KSOS-32 is now at a critical turning point in its develop-
               ment.  It  is running on a VAX-11/780 and could run on any
               processor in the VAX architecture family with  a  develop-
               ment  effort  of less than 6 man-months. KSOS-32 is a very
               secure system and has the potential to  reach  any  rating
               from  the  Trusted  Computer  System  Evaluation  Criteria
               [TCSEC], although B3 would  be  the  best  target  rating.
               Estimates  of  the  effort required to reach any rating in
               the TCSEC have been identified.  KSOS-32 is very UNIX-like
               and  this would allow it to support a secure UNIX environ-
               ment for application programs. The performance of  KSOS-32
               is  far  from its potential, primarily because at no point
               in its history has the project ever been funded to  inves-
               tigate performance issues or tune the system.

9

9









                                                                        9


               For all of these reasons, KSOS-32 is at a point where it
               could become a very cost-effective platform to support
               multi-level secure applications on a hardware base that
               spans a tremendous performance range (from the microVAX
               processors up to any of the large VAX uniprocessors).

               It is interesting to note that none of KSOS-32's current
               performance deficiencies arise as a result of the
               architecture or the security features of the system. They
               are all due to incomplete implementation or lack of
               tuning. This should not be surprising, as KSOS-11 was
               originally intended as a production prototype for secure
               systems and KSOS-32 has never been tuned.

               Instead of focusing on KSOS-32's existing performance, we
               should focus on KSOS-32's potential performance.
               Berkeley's 4.2 BSD UNIX is a fairly well-tuned and mature
               product, and it should be noted that KSOS-32's performance
               and implementation are very similar to those of the
               initial release of BSD UNIX (3 BSD). KSOS-32 development
               is now at almost exactly the same place Berkeley UNIX was
               in 1979. KSOS-32 has the same potential for performance,
               given a modest investment in time and money.



























9

9









                                                                       10


               _7.  _R_E_F_E_R_E_N_C_E_S



               [TCSEC]
                    DoD 5200.28-STD, Department of Defense Trusted
                    Computer System Evaluation Criteria, 15 August 1985.



               [KSOS85]
                    Logicon, Kernelized Secure Operating System (KSOS):
                    Migration to VAX Architecture, Feasibility Study.



               [McKu83]
                    McKusick, et al., "A Fast File System for UNIX", in
                    Volume 2c of The UNIX Programmer's Manual, August
                    1983.



               [Perr84]
                    Perrine, et al., "An Overview of the Kernelized
                    Secure Operating System (KSOS)", in Proceedings of
                    the Seventh DoD/NBS Computer Security Conference.























9

9



