JCMT Observatory Control System (and other stories)


  1. JCMT Observatory Control System (and other stories) • Nick Rees, Frossie Economou, Tim Jenness, Russell Kackley, Craig Walther (JAC) • Bill Dent, Martin Folger, Xiaofeng Gao, Dennis Kelly, John Lightfoot, Ian Pain (ATC) • Gary Hovey (DRAO) • Russell Redman (HIA)

  2. Our Credentials • How many of you have software that: • Runs on two or more telescopes? • With a wavelength range: long/short > 1000? • With a full suite of common user instruments? • With offline, fully distributed observation preparation software? • With fully flexible queue scheduling software? • With a publication-quality data reduction pipeline? • Used for every instrument? (heterodyne next year @ JCMT) • Used for every observation? • With a fault rate ~ 2% (software < 1%)? (only UKIRT)

  3. Problem… • 8 years ago we had a problem… • JCMT observations were controlled by a ‘control task’ which was a mesh of spaghetti code: • Lots of conditional code for each receiver (if RxA do this, else if RxB do that, else if RxW do something else) • Lots of complex parallel sequencing of the components of the controlled sub-systems. • Compiled, FORTRAN, VAX/VMS - obsolete.

  4. Solution!!! • 4 years ago we presented a solution at SPIE 1998. • Telescope Observation Designer and Driver (TODD) • We defined and implemented a graphical programming language to create easy-to-understand observing recipes. • To handle the parallelism problem it had simple parallel constructs similar to project scheduling software. • It was written in Java (buzzword compliant and portable?) • We projected all the recipes would be written and the new system would be working by summer 1998. • WRONG!

  5. Why??? • The new recipes were just as complex as the old FORTRAN control task. • The complexity was the same despite the new tool. • The best measure of complexity is a measure of the number of connections or control paths in the system • i.e. it is in the interface definitions and design. • The problem we were trying to solve was unchanged. • New tools are not necessarily better ones.

  6. What does this mean? • It’s the design, stupid!!! • The new tool had not fundamentally changed the design. • The largest benefit of a good programming tool is: • It provides infrastructure and/or • It encourages you to think of a problem in a different way. • However, the latter is not inherent - you can always implement a design in many ways (with many tools). • But the new tool had not done either of these anyway.

  7. Anyway, what happened? • In 1999 we merged the JCMT and UKIRT software groups, and the JCMT OCS was one of the casualties. • Since merging the groups we have thrown away 5 staff years of work, spread across 3 projects. • Originally scheduled to total 2 staff years.

  8. Take two - OCS Analysis • Classical Design: • Sequential set of simple commands issued by observer (or script) either at command line or GUI. • Unless instrument is telepathic or predictive, operational sequence is governed by the order commands are issued. • Sub-system has no way of knowing what sort of observing is being done until integration starts. • Complex parallel sequencing is required to ensure every sub-system is ready for integrating in the most efficient fashion.

  9. Take two - OCS Analysis • There has been one really important development: • Astronomers now have to submit formal observing definitions before observing. • During observing, there are two important concepts: • Configuration • which means that observing is broken into sections in which the system configuration should stay as constant as possible to preserve the overall calibration. • Start of an integration • there is always a rendezvous point just before data is collected when all systems have to be ready to observe. • Note there are one or more integrations per configuration.

  10. New Design • Since observations are fully defined, we changed the sub-system interfaces so they had a common set of simple commands: • CONFIGURE • Send a complete XML observing definition to each sub-system. • SETUP_SEQUENCE • Send just the few parameters which indicate how the next integration differs from the current configuration. • SEQUENCE • Execute the integration sequence - i.e. take data. • On JCMT we have a real-time sequencer that takes over here. • However, we had to write new sub-system interfaces. • But it's a well-defined problem and is easy to debug and test.
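The three-command interface above can be sketched in a few lines. This is an illustrative Python sketch only, not the JCMT code: the class names `SubSystem` and `FakeReceiver`, the XML element `receiver`, and the keyword parameters are all assumptions made up for the example.

```python
import xml.etree.ElementTree as ET
from abc import ABC, abstractmethod


class SubSystem(ABC):
    """Hypothetical base class: every sub-system accepts the same three commands."""

    def __init__(self, name):
        self.name = name
        self.config = None   # parsed XML from the last CONFIGURE
        self.pending = {}    # parameters from the last SETUP_SEQUENCE

    def configure(self, xml_text):
        # CONFIGURE: receive the complete observing definition.
        self.config = ET.fromstring(xml_text)
        self.pending = {}

    def setup_sequence(self, **changes):
        # SETUP_SEQUENCE: receive only what differs for the next integration.
        if self.config is None:
            raise RuntimeError(f"{self.name}: SETUP_SEQUENCE before CONFIGURE")
        self.pending = dict(changes)

    @abstractmethod
    def sequence(self):
        """SEQUENCE: take data; each sub-system supplies its own behaviour."""


class FakeReceiver(SubSystem):
    """Stand-in sub-system that only reports what it would do."""

    def sequence(self):
        rx = self.config.findtext("receiver")
        return f"{self.name}: integrating with {rx}, {self.pending}"


if __name__ == "__main__":
    frontend = FakeReceiver("frontend")
    frontend.configure("<observation><receiver>RxA</receiver></observation>")
    frontend.setup_sequence(INDEX=1, LOAD="SKY")
    print(frontend.sequence())
```

The point of the sketch is that the recipe never needs receiver-specific branches: it sends the same three commands to every sub-system, and each sub-system picks what it needs out of the full definition.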

  11. Resultant recipe • Flow: send the XML definition with the CONFIGURE command • More integration sequences? If yes, send the change of state with the SETUP_SEQUENCE command, then start data taking with the SEQUENCE command, and repeat; if no, finish. • Recipe (the slide does not show where $next, $row and $start are initialised):

        configure( $tasks, $configuration );
        while ($next)
        {
            $row++;
            # Recalibrate every $rows_per_cal rows.
            if (($row % $rows_per_cal) == 0)
            {
                $next = calibrate( $tasks, $start, 1,
                                   $n_hotsamples, $n_skysamples, $n_ambsamples );
                if ($next) { $start = $next };
            }
            # Take the science integration for this row.
            if ($next != 0)
            {
                $next = integrate( $tasks,
                                   {
                                     "SOURCE"     => "SCIENCE",
                                     "INDEX"      => $row,
                                     "BEAM"       => "MIDDLE",
                                     "INTEG_TIME" => $integ_time,
                                     "LOAD"       => "SKY"
                                   }, $start, 1 );
                if ($next) { $start = $next };
            }
        }
        # Do a final calibration
        $next = calibrate( $tasks, $start, 1,
                           $n_hotsamples, $n_skysamples, $n_ambsamples );
        $next = end_observation( $tasks, $next-1 );

  12. Lessons learned • Don’t get tied up with tools to solve a complex problem when the problem itself can be simplified. • An offline observing definition tool that is basically an XML editor is a lot easier to write and test than an on-line control system.

  13. What is important? • Management support • System architect • Science driven, technology enabled • Interfaces • Concrete design requirements • Must haves, not nice to haves. • Generalize to another concrete system, rather than chase down scientific ‘once in a million year’ requests. • A pre-existing model. • A common way of thinking • Communication

  14. What isn’t important? • A particular language (as long as you don’t have too many) • A particular framework (as long as you have at least one)

  15. Management Support • The most important management skill is social • Organizational culture must make for happy workers • Take life seriously, yourself not at all • ESO didn’t really get their developers to adopt the VLT common software because of management clout. • Gianni - how do you do it?

  16. Learning Curves • Typically, it takes a good engineer: • 3 days to learn ‘something new’ • 3 months to get productive at it • 3 years to get really proficient at it • ‘Something new’ can be just about anything: • A job • A language • A framework • A paradigm • Typically, it takes the computing industry <<3 years to come up with a new fad.

  17. However… • Virtually all of our code could be written equally well in a number of different programming languages. • All of our infrastructure ‘framework’ could equally well be something completely different. • Any of our operating systems could be swapped with another equally good one. • It’s the design, stupid!!! (not the tools) • Let’s stop all the religious wars, please.

  18. So... • Decide on a basic sub-set of all the stuff out there and get your staff proficient at it. • Learn a bit about many of the new tools out there, but think really hard before you adopt anything. • Balance it against 3 years/person of inefficiency. • FWIW, we use: • VxWorks, Solaris, Linux, (VAX/VMS) O/S. • ANSI C, Perl, (FORTRAN + Java) languages. • Drama, EPICS, SOAP, XML, Starlink, Perl infrastructure.

  19. What do you want in a framework? • Interface management (ALMA containers) • Command/response (synchronous control) • Publish/subscribe (asynchronous data, alarms) • Device abstraction • Scalable • efficient, naming services, point to point connections • Connection management • Error and alarm management • Standard tools • etc, etc, etc...

  20. Motivations • People on a project are motivated by various factors: • Fun, social camaraderie • Scientific • Technical • Financial • Bureaucratic • Most are, in fact, motivated by a couple…

  21. System architect • Understands and is motivated by both the scientific and technical goals. This helps keep the project on track. • Must be really motivated by the science goals when the geek goals are boring. • Must be really motivated by the geek goals when science goals are boring. • Can rephrase stated requirements in different ways to find out if they are must haves, or nice to haves. • Oh… and they do the design as well. • Absolutely essential, and a rare breed. • Introduce a new software metric of Architects/m²? • JAC ~ 0.05 (note to self, must introduce some cubicles)

  22. Interfaces • Without them, preferably written down, but at least well understood, there is no design. • The huge manpower gains happen when the architect(s) fiddles with the level of the interfaces. • Design them early, freeze them late. • Can be simplified by using structured data formats that can be passed through intermediate software layers, without interpretation. • Try and have a few, simple interfaces. • Try and make them generic.
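The bullet about structured data passing through intermediate layers without interpretation can be illustrated with a small sketch. This is a hedged Python example under invented names: `relay`, `telescope_handler`, the routing key `"tcs"` and the XML elements are hypothetical and are not taken from the JCMT system.

```python
import xml.etree.ElementTree as ET


def relay(target, payload, handlers):
    """Intermediate layer: routes on `target` only and never parses `payload`."""
    return handlers[target](payload)


def telescope_handler(payload):
    # Only the final recipient interprets the structured data.
    root = ET.fromstring(payload)
    return float(root.findtext("azimuth")), float(root.findtext("elevation"))


if __name__ == "__main__":
    message = "<target><azimuth>120.5</azimuth><elevation>45</elevation></target>"
    print(relay("tcs", message, {"tcs": telescope_handler}))
```

Because `relay` treats the payload as an opaque string, adding a new element to the definition means changing only the sub-system that cares about it, not every layer in between.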

  23. Real-time issues • Observatory control system is not normally real-time. • Some real-time control in low-level systems (e.g. servos and array control) • These are mostly very simple sequential operations with few complex scheduling constraints. • Only real issue is rendezvous at start of integration if having to synchronize with moving system (e.g. raster scanning or rotating polarimeter), and subsequent timing until the integration finishes.
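The rendezvous at the start of an integration can be pictured as a barrier: no sub-system starts taking data until every sub-system has reported ready. The following is a minimal Python sketch using threads, purely to illustrate the idea; the function `run_integration`, the sub-system names and the preparation times are assumptions, and a real observatory would use its own sequencer rather than this.

```python
import threading
import time


def run_integration(prep_times):
    """Each hypothetical sub-system prepares for prep_times[name] seconds,
    then waits at a barrier; none starts until all are ready."""
    barrier = threading.Barrier(len(prep_times))
    events = []
    lock = threading.Lock()

    def subsystem(name, prep):
        time.sleep(prep)                      # stand-in for slewing, tuning, etc.
        with lock:
            events.append(("ready", name))
        barrier.wait()                        # rendezvous point
        with lock:
            events.append(("start", name))    # integration begins

    threads = [threading.Thread(target=subsystem, args=item)
               for item in prep_times.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


if __name__ == "__main__":
    for kind, name in run_integration({"telescope": 0.02, "receiver": 0.0}):
        print(kind, name)
```

Every "ready" event is recorded before any "start" event, whatever the individual preparation times are, which is the property the rendezvous is there to guarantee.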

  24. Finally...
