CA integration tests

PowerPoint Slideshow about 'CA integration tests' - kaveri


Presentation Transcript
CA integration tests
  • We need a way to run integration tests
    • test IOCs -> CAJ -> pvmanager
    • Including disconnects due to power cycle and network downtime
    • Corner cases (e.g. different type at reconnect)
    • Ability to check server state (e.g. number of monitors open)
    • Ability to drop in and run the tests in production environment (to check specific versions of EPICS and network configurations)
  • Start a script on the server side, start a script on the client side, come back in 15 minutes
CA integration tests
  • Server side:
    • Requirements: EPICS base (softIoc), procServ
    • Start server script
      • Starts the first softIoc
      • Keeps listening on the “command” pv. Possible commands:
        • start IOCNAME NSEC – stops the current ioc, waits for NSEC, starts the ioc in the IOCNAME directory
        • netpause NSEC – brings down the network (ifconfig down) for NSEC
        • connections PVNAME – writes the number of monitors currently open on PVNAME (from casr 2) to the “output” pv
        • stop – stops the server side
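The command set above can be sketched as a small parser. This is only an illustration, not the actual server script: the class name `ServerCommand`, its fields, and the use of Java on the server side are assumptions; only the four command names and their arguments come from the slide.

```java
/** Hypothetical parser for the text commands written to the "command" pv. */
final class ServerCommand {
    enum Kind { START, NETPAUSE, CONNECTIONS, STOP }

    final Kind kind;
    final String name;   // IOCNAME or PVNAME; null when the command takes none
    final int seconds;   // NSEC; -1 when the command takes none

    private ServerCommand(Kind kind, String name, int seconds) {
        this.kind = kind;
        this.name = name;
        this.seconds = seconds;
    }

    /** Parses one command line; throws IllegalArgumentException if it is malformed. */
    static ServerCommand parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        switch (tokens[0]) {
            case "start":        // start IOCNAME NSEC
                requireArgs(tokens, 2);
                return new ServerCommand(Kind.START, tokens[1], Integer.parseInt(tokens[2]));
            case "netpause":     // netpause NSEC
                requireArgs(tokens, 1);
                return new ServerCommand(Kind.NETPAUSE, null, Integer.parseInt(tokens[1]));
            case "connections":  // connections PVNAME
                requireArgs(tokens, 1);
                return new ServerCommand(Kind.CONNECTIONS, tokens[1], -1);
            case "stop":         // stop
                requireArgs(tokens, 0);
                return new ServerCommand(Kind.STOP, null, -1);
            default:
                throw new IllegalArgumentException("Unknown command: " + line);
        }
    }

    private static void requireArgs(String[] tokens, int count) {
        if (tokens.length != count + 1) {
            throw new IllegalArgumentException(
                    tokens[0] + " expects " + count + " argument(s)");
        }
    }
}
```

A server loop would call `ServerCommand.parse(...)` on each new value of the “command” pv and dispatch on `kind`; how the IOC is actually stopped, or the network brought down, is outside this sketch.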
CA integration tests
  • Client side:
    • Library in pvmanager to make integration tests reasonable to write
    • Two phases
      • Run a series of tasks while recording all events that come out of pvmanager
      • Verify the order and number of events coming from pvmanager
    • If verification fails, you get a table with all the events gathered

public final void run() throws Exception {
    // Begin with the typeChange1 IOC
    init("typeChange1");
    // Read the channel through pvmanager at 50 Hz
    addReader(PVManager.read(channel("double-to-i32")), TimeDuration.ofHertz(50));
    pause(1000);
    // Switch to the typeChange2 IOC, where the pv changes type
    restart("typeChange2");
    pause(2000);
}

public final void verify(Log log) {
    // Expect: connected, disconnected, reconnected
    log.matchConnections("double-to-i32", true, false, true);
    // Expect: a double, the same double flagged as disconnected, then an int
    log.matchValues("double-to-i32", ALL_EXCEPT_TIME,
            newVDouble(0.0, newAlarm(AlarmSeverity.INVALID, "UDF_ALARM"), newTime(Timestamp.of(631152000, 0), null, false), displayNone()),
            newVDouble(0.0, newAlarm(AlarmSeverity.UNDEFINED, "Disconnected"), newTime(Timestamp.of(631152000, 0), null, false), displayNone()),
            newVInt(0, newAlarm(AlarmSeverity.INVALID, "UDF_ALARM"), newTime(Timestamp.of(631152000, 0), null, false), displayNone()));
}
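
To make the two phases concrete, here is a minimal sketch of what a `matchConnections`-style check might do internally: record events while the tasks run, then compare the exact order and number of events for one pv, and dump the whole table on failure. The class `ConnectionLog` and its methods are hypothetical and are not the pvmanager integration library's real `Log` class.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical event log; not the pvmanager integration library itself. */
final class ConnectionLog {
    private final List<String> pvNames = new ArrayList<>();
    private final List<Boolean> states = new ArrayList<>();

    /** Phase 1: record every connection change as it arrives. */
    synchronized void record(String pvName, boolean connected) {
        pvNames.add(pvName);
        states.add(connected);
    }

    /** Phase 2: check the exact order and number of connection events for one pv. */
    synchronized void matchConnections(String pvName, boolean... expected) {
        List<Boolean> actual = new ArrayList<>();
        for (int i = 0; i < pvNames.size(); i++) {
            if (pvNames.get(i).equals(pvName)) {
                actual.add(states.get(i));
            }
        }
        List<Boolean> wanted = new ArrayList<>();
        for (boolean b : expected) {
            wanted.add(b);
        }
        if (!actual.equals(wanted)) {
            // On failure, include the full table of gathered events
            throw new AssertionError(pvName + ": expected " + wanted
                    + " but got " + actual + "\n" + table());
        }
    }

    /** All recorded events, one per line, to help diagnose a failed verification. */
    synchronized String table() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < pvNames.size(); i++) {
            sb.append(i).append('\t').append(pvNames.get(i))
              .append('\t').append(states.get(i)).append('\n');
        }
        return sb.toString();
    }
}
```

Events for other pvs are ignored by the match but still appear in the table, so a failure shows everything that was gathered during the run.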

CA integration tests
  • Covered
    • Simple reboot: connect pv, ioc down, ioc up, only 1 monitor open
    • Simple network outage: connect, network down, network up, only 1 monitor open
    • Multiple reboots: connect pv, ioc cycle 10 times
    • Type change: connect double pv, ioc cycle, pv becomes integer
    • Constant pv: connect to double/int/string/enum that do not change
    • Slow changing pv: connect to double updating at 1 Hz (same rate received)
    • Fast changing pv: connect to double updating at 100 Hz (reduced rate received)
    • Alarm changing pv: connect to double updating at 1 Hz for alarm only
    • Write pv: change value for double/int
  • Not yet covered
    • Add all remaining types for disconnection test
    • Add all types for type change
    • Add all types for slow changing pvs
    • Add all types for fast changing pvs
    • Add all types for alarm changing pvs
    • Add all types for write pvs
    • Add metadata changes
    • Add access control changes
    • Add multiple readers on a single pv (only 1 monitor open)
    • Add nanosec out of range for time
    • Old RTYP handling
Review BOY connection layer
  • Review connection layer in BOY to:
    • Solve concurrency issues
      • Likely cause of missed events
    • Investigate performance problems
      • Background load
      • Slow to open some screens (>5 sec)
    • Find better ways to integrate pvmanager
Review BOY connection layer
  • Findings:
    • State of widgets accessed/changed from different threads without synchronization
    • Simple.pv pvmanager implementation
      • uses 4 different synchronization methods, not well coordinated, some unneeded
        • synchronized, volatile, Atomic variable, thread-safe collections
      • Simple.pv interface forces calls to be split and then re-merged
        • E.g. connection/value are one callback in pvmanager, split into two, later recombined
      • Sets the pvmanager rate throttling at 50 Hz and then does an additional throttling at 10 Hz
      • Script interface: utility.pv implementation provides all values; pvmanager implementation does not
    • Different widgets with different needs go through the same code path
      • E.g. All widgets create a writer, even if they are monitors. Same code for both widgets that need queuing and widgets that need caching
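
The double throttling finding can be illustrated with a small simulation. This is a simplified sketch under stated assumptions: the `MinIntervalThrottle` class is hypothetical and models each stage as a plain minimum-interval filter, which is not necessarily how pvmanager or BOY implement rate limiting. It shows that, for a 100 Hz source, a 50 Hz stage followed by a 10 Hz stage delivers the same number of events as the 10 Hz stage alone, so the first stage only adds an extra hop.

```java
/** Simplified min-interval throttle; real rate limiting may work differently. */
final class MinIntervalThrottle {
    private final long periodMs;
    private long lastPassedMs;
    private boolean hasPassed;

    MinIntervalThrottle(long periodMs) {
        this.periodMs = periodMs;
    }

    /** Lets an event through only if at least periodMs elapsed since the last one passed. */
    boolean accept(long timeMs) {
        if (!hasPassed || timeMs - lastPassedMs >= periodMs) {
            hasPassed = true;
            lastPassedMs = timeMs;
            return true;
        }
        return false;
    }

    /** Counts events that survive all stages, for a source emitting every sourcePeriodMs. */
    static int countDelivered(long sourcePeriodMs, long durationMs, long... stagePeriodsMs) {
        MinIntervalThrottle[] stages = new MinIntervalThrottle[stagePeriodsMs.length];
        for (int i = 0; i < stages.length; i++) {
            stages[i] = new MinIntervalThrottle(stagePeriodsMs[i]);
        }
        int delivered = 0;
        for (long t = 0; t < durationMs; t += sourcePeriodMs) {
            boolean passed = true;
            for (MinIntervalThrottle stage : stages) {
                if (!stage.accept(t)) {   // a later stage never sees a dropped event
                    passed = false;
                    break;
                }
            }
            if (passed) {
                delivered++;
            }
        }
        return delivered;
    }
}
```

Under these assumptions, over one simulated second `countDelivered(10, 1000, 20, 100)` (50 Hz then 10 Hz) and `countDelivered(10, 1000, 100)` (10 Hz only) both return 10, while `countDelivered(10, 1000, 20)` returns 50.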
Review BOY connection layer
  • Changes on special branch:
    • Connecting BOY directly to pvmanager, skipping utility.pv
    • Making sure all events go on the UI thread
      • May solve missed events, but was never tested
    • Removed unnecessary context switches
      • Using pvmanager proper event throttling, removing EventBundlingThread
    • Added pause/resume when widgets out of screen
    • Script interface too problematic to touch
      • Hope was to re-implement rules on top of pvmanager
      • Can’t be done in general as rule user parameters are basically javascript pieces that are concatenated
        • No formal parsing or rule definition
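
One way to picture the pause/resume change is a reader that stops notifying its widget while paused and delivers only the latest value on resume. This is a hedged sketch, not the code on the branch: `PausableReader` and its methods are hypothetical, and the "deliver latest value once on resume" behaviour is an assumption. In BOY the listener would also need to be dispatched on the UI thread, which this sketch does not do.

```java
import java.util.function.Consumer;

/** Hypothetical reader wrapper that suppresses notifications while paused. */
final class PausableReader<T> {
    private final Consumer<T> listener;
    private T latest;
    private boolean paused;
    private boolean changedWhilePaused;

    PausableReader(Consumer<T> listener) {
        this.listener = listener;
    }

    /** Called by the data source for every new value. */
    synchronized void onNewValue(T value) {
        latest = value;
        if (paused) {
            changedWhilePaused = true;   // remember, but do not touch the widget
        } else {
            listener.accept(value);
        }
    }

    /** Pause when the widget is off screen; resume when it becomes visible again. */
    synchronized void setPaused(boolean paused) {
        this.paused = paused;
        if (!paused && changedWhilePaused) {
            changedWhilePaused = false;
            listener.accept(latest);     // catch up with a single notification
        }
    }
}
```

A widget would call `setPaused(true)` when it is hidden and `setPaused(false)` when it is shown again, so intermediate values received while hidden cost no redraws.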
Review BOY connection layer
  • Background load
    • Sources of background load differ between environments
      • On my development environment (Windows/Debian/Scientific Linux) the main source of load is SWT. Pause/Resume makes 64% load go to 4% when the window is hidden.
      • On one BNL production machine, the main source of load seems to be the synchronization used in the thread pool used by pvmanager during the active scanning. Pause/Resume has no significant benefit.
      • On another BNL production machine, the main load was SWT, but Pause/Resume had no effect.
      • Not OS dependent. Maybe hardware, or hardware + OS combinations.
  • Slow load
    • Traced back to use of rules. Each rule is a script. Each script starts a scripting environment. Each scripting environment seems to load a lot of classes (interaction between classloaders and OSGi?). Loading a screen with a large set of rules gets stuck loading/unloading classes for several seconds.
Review BOY connection layer
  • Takeaway:
    • Work that needs to be done in BOY
      • Finish proper pvmanager integration
      • Properly divide widget state (should all be in the model) so that real-time only updates that
      • Don’t just have one connection logic for all widget types
      • Understand how to implement rules on top of pvmanager (re-implement or migrate?)
    • Whoever does this work will not be able to do the testing himself; needs prompt support and feedback
      • Performance profile is significantly different
      • Concurrency issues are difficult to replicate
Review BOY connection layer
  • Takeaway:
    • For pvmanager
      • Wrote 100 times on the blackboard: “My development environment is not a good approximation of all production environments”
        • Will prepare a performance benchmarking suite to gather data so I can keep track
      • Passive scanning moved to the “toppish” part of the list. Also considering implementing a different ExecutorService