Stackless Python: programming the way Guido prevented it intended 

Presentation Transcript


  1. Stackless Python: programming the way Guido prevented it intended 

  2. Back To IPC9 developer's day

  3. Why Stackless is Cool
     • Microthreads
     • Generators (now obsolete)
     • Coroutines

  4. Microthreads
     • Very lightweight (can support thousands)
     • Locks need not be OS resources
     • Not for blocking I/O
     • A comfortable model for people used to real threads
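To make the "thousands of lightweight tasks" claim concrete, here is a minimal sketch in plain CPython. It uses generators and a round-robin loop rather than the Stackless API, so `worker` and `run_all` are hypothetical names chosen for illustration only; real microthreads switch without an explicit `yield`.

```python
from collections import deque


def worker(ident, steps, log):
    """Hypothetical microthread: does `steps` units of work, yielding after each."""
    for step in range(steps):
        log.append((ident, step))
        yield  # voluntary switch point


def run_all(tasks):
    """Round-robin scheduler: resume each task in turn until all have finished."""
    ready = deque(tasks)
    switches = 0
    while ready:
        task = ready.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, drop it
        ready.append(task)  # still alive, goes to the back of the queue
        switches += 1
    return switches


log = []
tasks = [worker(i, 3, log) for i in range(10_000)]
switches = run_all(tasks)
print(switches, len(log))  # 30000 30000
```

No OS thread or OS lock is created here: ten thousand tasks are just ten thousand small heap objects, and since a task can only be interrupted at a `yield`, shared data such as `log` needs no locking.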

  5. Coroutines
     Various ways to look at them
     • Peer-to-peer subroutines
     • Threads with voluntary swapping
     • Generators on steroids (args in, args out)
     What's so cool about them
     • Both sides get to "drive"
     • Often can replace a state machine with something more intuitive [1]
     [1] Especially where the state machine features complex state but relatively simple events (or few events per state).
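The state-machine point can be illustrated with a small sketch. It uses a standard Python generator with `send()`, which is only an asymmetric approximation of the coroutines described on the slide; the toy protocol (`HELLO`, numbers, `BYE`) and the name `session` are invented for this example.

```python
def session():
    """Toy protocol handler written as straight-line code, not a state table.

    The protocol phase is encoded by where execution is suspended, so no
    explicit `state` variable is needed.
    """
    msg = yield "ready"
    while msg != "HELLO":                  # phase 1: wait for the greeting
        msg = yield "error: expected HELLO"
    msg = yield "welcome"
    total = 0
    while msg != "BYE":                    # phase 2: accumulate numbers
        total += int(msg)
        msg = yield f"sum={total}"
    yield f"goodbye, total={total}"        # phase 3: sign off


handler = session()
replies = [next(handler)]                  # run to the first yield
for message in ["42", "HELLO", "1", "2", "BYE"]:
    replies.append(handler.send(message))  # args in, args out

print(replies)
```

The caller "drives" by sending messages, while the handler drives its own control flow between them; the complex state (`total`, the current phase) lives in ordinary local variables and loop positions.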

  6. Three Steps To Stacklessness
     • Get Python data off the C stack
     • Give each frame its own (Python) stackspace
     • Get rid of interpreter recursions
     Result
     • All frames are created equal
     • Stack overflows become memory errors
     • Pickling program state becomes conceivable (new: *has* been done)

  7. Getting rid of recursion is difficult
     • Often there is "post" processing involved
     • The C code (doing the recursing) may need its own "frame"
     • Possible approaches
       • Tail-optimized recursion
       • Transformation to loop
     Either way, the "post" code needs to be separated from the "setup" code.
     Ironic note: This is exactly the kind of pain we seek to relieve the Python programmer of!
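The "transformation to loop" approach, with its split between "setup" and "post" code, can be sketched on a small Python function. This is an illustration of the general technique only, not the interpreter changes the talk refers to; the functions and the list-based "frame" layout are assumptions made for the example.

```python
_DONE = object()  # sentinel: no children left in the current frame


def depth_recursive(tree):
    """Nesting depth of a list; the `1 + max(...)` runs *after* the calls ("post")."""
    if not isinstance(tree, list):
        return 0
    return 1 + max((depth_recursive(child) for child in tree), default=0)


def depth_iterative(tree):
    """Same result, but frames live on the heap in an explicit list."""
    if not isinstance(tree, list):
        return 0
    # frame = [iterator over remaining children, deepest child seen so far]
    frames = [[iter(tree), 0]]
    while True:
        children, best = frames[-1]
        child = next(children, _DONE)
        if child is _DONE:
            # "post" code: the frame is finished, hand its value to the caller
            frames.pop()
            value = 1 + best
            if not frames:
                return value
            frames[-1][1] = max(frames[-1][1], value)
        elif isinstance(child, list):
            # "setup" code: what used to be the recursive call
            frames.append([iter(child), 0])
        # a non-list leaf has depth 0 and cannot raise `best`


deep = []
for _ in range(100_000):  # far deeper than the default recursion limit
    deep = [deep]

print(depth_iterative([1, [2, [3]], []]))  # 3
print(depth_iterative(deep))               # 100001
```

With the explicit frame list, the nesting depth is bounded by available heap memory rather than by the C stack, which is the slide's point about stack overflows becoming memory errors. The cost is visible too: the loop version is noticeably harder to read than the recursive one.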

  8. Stackless Reincarnate
     • Completely different approach:
       • Nearly no changes to the Python core
       • Platform dependent
       • Few lines of assembly
     • No longer fighting the Python implementation
     • Orthogonal concepts

  9. Platform Specific Code

        __forceinline static int
        slp_switch(void)
        {
            int *stackref, stsizediff;
            __asm mov stackref, esp;
            SLP_SAVE_STATE(stackref, stsizediff);
            __asm {
                mov eax, stsizediff
                add esp, eax
                add ebp, eax
            }
            SLP_RESTORE_STATE();
        }

     Note: There are no arguments, in order to simplify the code

  10. Support Macros 1(2)

        #define SLP_SAVE_STATE(stackref, stsizediff) \
        {\
            PyThreadState *tstate = PyThreadState_GET();\
            PyCStackObject **cstprev = tstate->slp_state.tmp.cstprev;\
            PyCStackObject *cst = tstate->slp_state.tmp.cst;\
            int stsizeb;\
            if (cstprev != NULL) {\
                if (slp_cstack_new(cstprev, stackref) == NULL) return -1;\
                stsizeb = (*cstprev)->ob_size * sizeof(int*);\
                memcpy((*cstprev)->stack, (*cstprev)->startaddr - (*cstprev)->ob_size, stsizeb);\
                (*cstprev)->frame = tstate->slp_state.tmp.fprev;\
            }\
            else\
                stsizeb = (cst->startaddr - stackref) * sizeof(int*);\
            if (cst == NULL) return 0;\
            stsizediff = stsizeb - (cst->ob_size * sizeof(int*));\

     Note: Arguments are passed via Threadstate for easy implementation

  11. Support Macros 2(2)

        #define SLP_RESTORE_STATE() \
            tstate = PyThreadState_GET();\
            cst = tstate->slp_state.tmp.cst;\
            if (cst != NULL)\
                memcpy(cst->startaddr - cst->ob_size, &cst->stack, (cst->ob_size) * sizeof(int*));\
            return 0;\
        }\

  12. Stacklessness via Stack Slicing
     • Pieces of the C stack are captured
     • Recursion limited by heap memory only
     • Stack pieces attached to frame objects
     • "One-shot continuation"

  13. Tasklets
     • Tasklets are the building blocks
     • Tasklets can be switched
     • They behave like tiny threads
     • They communicate via channels

  14. Tasklet Creation

        import stackless

        # a function that takes a channel as argument
        def simplefunc(chan):
            chan.receive()

        # a factory for some tasklets
        def simpletest(func, n):
            c = stackless.channel()
            gen = stackless.taskoutlet(func)
            for i in range(n):
                gen(c).run()
            return c

  15. Inside Tasklet Creation
     • Create frame "before call"
     • Abuse of generator flag
     • Use "initial stub" as a blueprint
     • slp_cstack_clone()
     • Parameterize with a frame object
     • Wrap into a tasklet object
     • Ready to run

  16. Channels
     • Known from OCCAM, Limbo, Alef
     • Channel.send(x)
       • Activates a waiting tasklet with data
       • Blocks if no tasklet is waiting to receive
     • y = Channel.receive()
       • Activates a waiting tasklet, returns data
       • Blocks if no tasklet is waiting to send
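The rendezvous behaviour of channels can be modelled in plain CPython. The sketch below is an assumption-laden imitation built on generators, not the Stackless implementation: tasks request an operation by yielding a tuple, whereas real tasklets call `send()` and `receive()` directly. `Channel`, `run`, `producer` and `consumer` are hypothetical names.

```python
from collections import deque


class Channel:
    """Toy channel: holds the tasks currently blocked on it."""

    def __init__(self):
        self.senders = deque()    # blocked (task, value) pairs
        self.receivers = deque()  # blocked tasks


def run(tasks):
    """Run generator tasks; a task blocks until a partner arrives on its channel."""
    ready = deque((task, None) for task in tasks)  # (task, value to resume with)
    while ready:
        task, value = ready.popleft()
        try:
            kind, chan, *payload = task.send(value)
        except StopIteration:
            continue  # task finished
        if kind == "send":
            if chan.receivers:  # a receiver is waiting: hand over the data
                ready.append((chan.receivers.popleft(), payload[0]))
                ready.append((task, None))
            else:               # nobody is receiving: block the sender
                chan.senders.append((task, payload[0]))
        elif kind == "receive":
            if chan.senders:    # a sender is waiting: take its data
                sender, data = chan.senders.popleft()
                ready.append((task, data))
                ready.append((sender, None))
            else:               # nobody is sending: block the receiver
                chan.receivers.append(task)
    # tasks still blocked on a channel at this point are simply abandoned


def producer(chan, items):
    for item in items:
        yield ("send", chan, item)


def consumer(chan, count, out):
    for _ in range(count):
        item = yield ("receive", chan)
        out.append(item)


chan = Channel()
received = []
run([consumer(chan, 3, received), producer(chan, ["a", "b", "c"])])
print(received)  # ['a', 'b', 'c']
```

Whichever side reaches the channel first is parked on it, and the arrival of the other side makes both runnable again, so the order in which the two tasks are started does not affect the result.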

  17. Planned Extensions
     • Async I/O in a platform independent way
     • Prioritized scheduling
     • High speed tasklets with extra stacks
     • Quick monitors which run between tasklets
     • Stack compression
     • Thread pickling
     • More channel features
     • Multiple wait on channel arrays

  18. Thread pickling
     • Has been implemented by TwinSun
     • Unfortunately for old Stackless
     • Analysis of the C stack necessary
     • By platform, only
     • Lots of work?
     • Only a few contexts need stack analysis
     • Show it!!!

  19. Stackless Sponsors
     • Ironport
       • Email server with dramatic throughput
       • Integrating their code with the new Stackless
       • Async I/O
     • CCPGames
       • Massively multiplayer online game EVE
       • Porting their client code to new Stackless next week
