Twenty years at Sun

Photo credit: Milton Stephenson.

Here I am, receiving my 20-year award from Brian Hackett at the (very) last UK Systems Practice team meeting in London on May 15th. I chose the Bose iPod dock thingy as my gift, and am really pleased with it. The little pin thingy is quite cute too, but the certificate signed by MLP is another matter entirely!

The bulk of the preceding posts are taken from an internal site I put together as part of my (successful) application to become a Principal Field Technologist. The material highlights some of the fun stuff I’ve done at Sun over twenty years, but it’s just too much detail to include in a CV/resume :)

Lakeland Holistic Performance Workshop

Convinced of the need for more people who share my holistic view of systems performance (not simply people who know some DTrace syntax), last year I organised an autumn mentoring workshop in the Lake District for a dozen hand-picked folk from Sun UK. I hope to do the same again soon, so if you are interested please send me an email introducing yourself and explaining why you think you qualify.

SUPerG, Oracle World, Sun Tech Days, JavaU, CEC, Developer Days, OSUGs, etc

Since my initial “baptism of fire” at the Sun UK User Group / UKUUG meeting all those years ago (see below) I have really gotten into this presenting thing, and to audiences both large and small, internal and external. Here’s some of my more outrageously well received subject matter …

  • Solaris: where innovation happens (my current pitch – see above)
  • Solaris: greater than the sum of its parts
  • DTrace for Dummies (a comedy double act with Jon Haslam)
  • libMicro: we scare because we care
  • A Brief History of Threads (underlining Sun’s leadership in multithreading)
  • Solaris 7: sixty-four reasons to upgrade (that’s bits, stupid!)

Almost an author

My name has made it into a number of books, but has yet to make it onto the front cover. In addition to directing the photo shoot for the covers of the second edition of Solaris Internals (that T2000 prototype was never the same again), and of Solaris Performance and Tools (DTrace is child’s play, even Jon Haslam can do it), I also merited a special mention for contributions to the chapter on the Solaris process model.

To my shame, I have, at times, referred some of my more truculent customers to the “acknowledgments” sections of these, and of Cockcroft’s Sun Performance Tuning (second edition), with a cursory “I taught them all they know, so shut up, and do as I say!” This always has the desired effect (some have even asked me to sign their copy, and if you can find one I haven’t signed, it is worth a fortune)!

The book I don’t tend to mention is boohoo – a dot.com story from concept to catastrophe where I am erroneously credited with the sale of an E10K. My actual advice was “fix your code, because as it is, it won’t scale on an E10K” (but even that wasn’t enough to save them from disaster).

“Ambassador, you’re really spoiling us!”

I was an OS Ambassador from before Solaris 2.0 shipped, and I still have the golden edition signed by Bill Joy to prove it! In my biased opinion, OS Ambassadors has been the most successful of the ambassador programmes, bringing tangible value to the field and engineering alike. Our conferences became a forum for change, and sometimes served as a watering hole for different engineering groups, working on similar projects in total isolation (we were excellent match-makers).

As my experience and confidence grew, I became more vocal and more of a driver. When folk like Richard McDougall (who can forget his VxVM vs SDS coin?) moved on to other things and the original Ambassador Group Boards were formed, I joined the leadership team. I too moved on when I joined PAE, but I maintained “honorary ambassador” status until returning to the field.

Back in the UK, Chris Gerhard and I started “uk-solaris” as a forum for all with a technical interest in Solaris from various field and engineering roles. At our first meeting we “treated” everyone to Ferrero Rocher “chocolates”, which provoked the famous line from one of the cheesiest TV adverts ever (preserved for posterity here).

Putting something back

My first two putbacks into Solaris were a huge learning experience, and my respect for those who do this kind of engineering day in, day out grew immensely. I’d recommend the experience to anyone who needs a better understanding of the process …

  • 4991763 getenv doesn’t scale
  • 5105528 fix for 4915617 breaks simple multiprocess rwlock test case
  • 5105683 fix for 4915617 should be kinder with uncontended shared rwlocks
  • 6209711 thread error detection false positives possible with shared mutexes

libMicro: we scare because we care

In some ways, libMicro was a reaction to LMbench (which Bart Smaalders and I considered unscientific and a pain in the neck), but we really wanted to write a useful tool which could produce compelling data to drive improvements in Solaris. The result has exceeded our expectations dramatically. Not only has libMicro produced data for many “Linux is faster than Solaris at xxx” bugs, but it also kick-started Sun’s interest in the AMD Opteron processor (as well as helping the adoption of SPARC64).

libMicro also has the distinction of being one of the first open source projects hosted under Mercurial on the opensolaris.org collaboration website. It is still used extensively within Sun, and the code has also proven to be a useful reference for those wanting to write multithreaded applications. Today libMicro can be found alive and well here, and even our competitors are using it!

PRISM and the patent

Before Solaris could have large page support for program text and data, we needed a business case. PRISM stands for Process Relocation in Intimate Shared Memory, and was my first big innovation whilst in PAE. The idea is simple: stop the process, copy a region of small pages somewhere, unmap the source region, remap the source region with large pages, copy the data back, and then allow the process to continue. At the time ISM was the only source of large pages.
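The sequence above can be sketched in a few lines of C. This is only an illustration of the copy, unmap, remap, copy-back idea, not the PRISM code itself: the function name relocate_region is hypothetical, the region is assumed to be page-aligned, private, anonymous and read/write, and Linux's madvise(MADV_HUGEPAGE) hint stands in for ISM as the source of large pages. It also assumes the caller has already stopped everything else in the process.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: relocate [addr, addr + len) in place.
 * Assumes the range is page-aligned, private, anonymous and read/write,
 * and that no other thread runs while this is in progress.
 * Returns 0 on success, -1 on failure.
 */
int relocate_region(void *addr, size_t len)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0 || len == 0 ||
        (uintptr_t)addr % (uintptr_t)pagesize != 0 ||
        len % (size_t)pagesize != 0)
        return -1;

    /* 1. Copy the region of small pages somewhere else. */
    void *save = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (save == MAP_FAILED)
        return -1;
    memcpy(save, addr, len);

    /* 2. Unmap the source region. Until step 3 completes nothing may
     *    touch or claim this range, hence "stop the process" first. */
    if (munmap(addr, len) != 0) {
        munmap(save, len);
        return -1;
    }

    /* 3. Remap the same virtual addresses with fresh pages. */
    void *fresh = mmap(addr, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (fresh == MAP_FAILED) {
        munmap(save, len);
        return -1;
    }
#ifdef MADV_HUGEPAGE
    /* Assumption: a best-effort large page hint; failure is ignored. */
    (void)madvise(fresh, len, MADV_HUGEPAGE);
#endif

    /* 4. Copy the data back, then drop the temporary copy. */
    memcpy(fresh, save, len);
    assert(memcmp(fresh, save, len) == 0);
    munmap(save, len);
    return 0;
}
```

Called on a page-aligned anonymous buffer, the helper leaves every pointer into the region valid, because the virtual addresses never change; only the pages behind them do.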

My first solution used the LD_PRELOAD shared library interposition technique, but I quickly moved on to LD_AUDIT interposition because this provides more fine-grained control. Operating at process startup (with the inclusion of an optional dummy malloc() and free() to preallocate the heap before the relocation took place), PRISM generated plenty of useful data to fuel the MPSS and Large Pages OOB projects. It also highlighted the usefulness of local copies of read-only text and data for large scale NUMA machines.

The PRISM library helped some of our published CPU benchmark numbers, and so had to be shipped with some versions of our compilers. This triggered the patent filing process, with my patent finally being awarded a year or so later.

About five years later, with MPSS and Large Pages OOB in place, I revisited the PRISM idea with Shatter, a tool to break up large pages into smaller ones. This formed part of Nicolai Kosche’s dataspace profiling initiative (DProfile), which was trying to understand the effect of page colouring on performance.

A brief history of threads

Before joining PAE (Performance and Availability Engineering), I worked with a major European database vendor on their kernel scalability (on behalf of a mutual customer, a leading media company). We were fighting limitations in an aging implementation of Sun’s pioneering two-level thread model (something which became known as “old and broken libthread”). During one of my OS Ambassador trips, I visited Bryan Cantrill and Roger Faulkner, and discovered that Bryan had sketched and Roger had prototyped a new implementation based on a one-level model. I then used the customer as the business case for introducing the one-level “alternate” implementation in Solaris 8 (under /usr/lib/lwp).

By the time I joined PAE, the new implementation had gained quite a reputation for fixing scalability and stability problems with many multithreaded applications. PAE had many fans of the two-level concept, so I found myself immediately in conflict with some of my new colleagues. But I stuck to my guns and was able to win most of them over to the one-level model. I then worked with Roger, Bart Smaalders and others to make the one-level model the only implementation in Solaris 9. Part of my contribution to this effort was to write the technical whitepaper Multithreading in the Solaris Operating Environment:

  • The original version on www.sun.com [pdf]
  • The revised version as presented at SUPerG [pdf]

This paper has become a widely quoted document of how we do multithreading, and is still relevant today. Of course, the new thread implementation paved the way for Roger’s 1600 file putback to unify the Solaris process model, making threads first class citizens in Solaris – something Linux may actually never achieve!

Education

I don’t consider myself an academic, but I did manage a “Desmond” honours degree in Microelectronics and Computing from the University of Wales, Aberystwyth. It was there that I first fell in love with BSD UNIX (on a VAX 11/750), and there that I saw my first Sun workstation (a 2/120, although I was never allowed to use it).

I have since maintained an active interest in the education market, because I feel it is a natural recruiting ground for future Sun employees and customers. Indeed, I have recruited at least three people into Sun from Aberystwyth alone. Over the years I have worked with the Universities of Aberystwyth, Bangor, Bradford, Dundee, Durham, Leeds, Liverpool, Manchester, Oxford, St Andrews, Salford, Warwick and York, most recently on Sysadmin day conferences in Aberystwyth and Manchester.