History, Architecture, and Implementation of the CLR Serialization and Formatter Classes

Peter de Jong, April 24, 2003


  1. History, Architecture, and Implementation of the CLR Serialization and Formatter Classes. Peter de Jong, April 24, 2003

  2. History
     • J++ DCOM: 1997
     • J++ SOAP: 1998
     • CLR .NET Remoting: Spring 1999
     • CLR Serialization Classes: Spring 1999
     • CLR SoapFormatter: Spring 1999
     • CLR BinaryFormatter: December 1999
     • CLR V1: January 2002

  3. Original SOAP Spec (Bob Atkinson), 1997
     • Protocol
         • HTTP
         • Bi-directional: "Give me a call", a server callback using the response from a hanging HTTP request
     • XML
         • No namespaces, no XSD
     • RPC
         • SOAP Header: root for SOAP headers and parameter graph
         • No Envelope
     • J++ Proxy/Stub for serialization/deserialization of interface parameters
     • [Diagram: J++ SOAP HTTP Server and Client; SOAP Root with Parameters and SOAP Headers]

  4. CLR SOAP
     • SOAP 0.9 spec
         • Section 5 specifies how to map objects
         • Namespaces, no XSD
         • SOAP Envelope
     • RPC: rooted Headers and Parameters
     • Serialization: root of object graph
     • Most annoying part
         • Headers are really an array of objects
         • For XML beauty, they are specified as XML field elements
         • Led to the specification of the root attribute

  5. SOAP Moving Target
     • Original SOAP
     • SOAP 0.9
     • SOAP as a cottage industry
         • Easy to produce a subset of SOAP
         • Microsoft had 5 or so implementations
         • Individuals and companies set up SOAP web sites
     • SOAP Interop Meeting (IBM, 2000-2001)
         • SOAP application benchmarks
         • Led to web sites which implemented the applications
         • ~15 sites to test interoperability
     • SOAP 1.0
         • Standards effort which included many of the SOAP producers
         • Envelope, body; no header or parameter root
         • Moved Section 5 to an appendix
     • SOAP 1.1
         • Nest top-level object

  6. Serialization Classes

  7. Architecture
     [Diagram: BinaryFormatter (Serializer/Parser over a binary stream) and SoapFormatter (Serializer/Parser over a SOAP XML stream), each with Object Reader/Object Writer layers, built on top of the Serialization Classes]

  8. Serialization Classes
     • Designed to make it easy to produce formatters
         • True for a subset of the CLR
         • False for the complete CLR object model
     • SoapFormatter and BinaryFormatter are the only serialization/deserialization engines which support the complete CLR model

  9. Serialization Classes Services
     • System-controlled serialization (Serializable, NonSerialized)
     • User-controlled serialization (ISerializable)
     • Type substitution (ISerializationSurrogate, ISurrogateSelector)
     • Object substitution (IObjectReference)
     • Object sharing, fixups

  10. System Controlled Serialization
      • Serialization
          • Serializable custom attribute
          • NonSerialized custom attribute
          • public, internal, and private fields are serialized
      • Deserialization
          • Creates an uninitialized object
          • Populates the fields
          • Constructor is not called
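The behavior on this slide can be seen in a small round trip. This is a minimal sketch, assuming a .NET Framework target where BinaryFormatter is available; the `Account` type and the `SystemControlledDemo` class are hypothetical names invented for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Account
{
    public static int ConstructorCalls;   // static, so never part of the stream

    public string Owner;                  // public field: serialized
    private int balance;                  // private field: serialized as well

    [NonSerialized]
    private int cachedScore = -1;         // skipped by the formatter

    public Account(string owner, int balance)
    {
        ConstructorCalls++;
        Owner = owner;
        this.balance = balance;
    }

    public int Balance { get { return balance; } }
    public int CachedScore { get { return cachedScore; } }
}

public static class SystemControlledDemo
{
    // Serialize to memory and read the graph straight back.
    public static Account RoundTrip(Account original)
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, original);
            stream.Position = 0;
            return (Account)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        Account copy = RoundTrip(new Account("ann", 42));
        Console.WriteLine(copy.Owner);                // ann
        Console.WriteLine(copy.Balance);              // 42
        Console.WriteLine(copy.CachedScore);          // 0, not -1: no initializer or constructor ran
        Console.WriteLine(Account.ConstructorCalls);  // 1: only the original was constructed
    }
}
```

The `cachedScore` field comes back as 0 because the copy is created uninitialized and then populated field by field, so any state that is not in the stream has to be rebuilt by other means.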

  11. User Controlled Serialization
      • Class implements ISerializable
      • Serialization: GetObjectData gives name/value pairs to the serializer
      • Deserialization: a constructor is used to retrieve the name/value pairs and populate the object
          • Constructor is not in the interface, so the compiler can’t check whether it is present
          • Constructor isn’t inherited, so each subclass needs its own constructor
          • Earlier version used SetObjectData instead of a constructor
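The pattern described on this slide looks roughly as follows. This is an illustrative sketch, assuming a .NET Framework target; `Money`, `TaggedMoney`, and the entry names `"cents"` and `"tag"` are hypothetical.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Money : ISerializable
{
    private readonly long cents;

    public Money(long cents) { this.cents = cents; }

    // Deserialization constructor: not part of ISerializable, located by
    // reflection at run time, so a missing one is only detected then.
    protected Money(SerializationInfo info, StreamingContext context)
    {
        cents = info.GetInt64("cents");
    }

    // Serialization: hand name/value pairs to the serializer.
    public virtual void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("cents", cents);
    }

    public long Cents { get { return cents; } }
}

[Serializable]
public class TaggedMoney : Money
{
    private readonly string tag;

    public TaggedMoney(long cents, string tag) : base(cents) { this.tag = tag; }

    // Constructors are not inherited, so the subclass repeats the pattern.
    protected TaggedMoney(SerializationInfo info, StreamingContext context)
        : base(info, context)
    {
        tag = info.GetString("tag");
    }

    public override void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        base.GetObjectData(info, context);
        info.AddValue("tag", tag);
    }

    public string Tag { get { return tag; } }
}

public static class UserControlledDemo
{
    public static TaggedMoney RoundTrip(TaggedMoney original)
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, original);
            stream.Position = 0;
            return (TaggedMoney)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        TaggedMoney copy = RoundTrip(new TaggedMoney(1250, "rent"));
        Console.WriteLine(copy.Cents);  // 1250
        Console.WriteLine(copy.Tag);    // rent
    }
}
```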

  12. Surrogates
      • Type substitution
      • Objects of a specified type are replaced by a new object of a different type
      • [Diagram: MarshalByRefObject, Proxy, ObjRef]
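A surrogate takes over the serialization of a type it does not own. The sketch below is a simplified case, not the remoting MarshalByRefObject/ObjRef mechanism from the slide: it registers a surrogate for a type that is not marked Serializable. It assumes a .NET Framework target; `Endpoint` and `EndpointSurrogate` are hypothetical names.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

// Not marked [Serializable]: the formatter can only handle it through a surrogate.
public class Endpoint
{
    public string Host;
    public int Port;
}

public class EndpointSurrogate : ISerializationSurrogate
{
    // Called instead of the formatter's own field walk.
    public void GetObjectData(object obj, SerializationInfo info, StreamingContext context)
    {
        var endpoint = (Endpoint)obj;
        info.AddValue("host", endpoint.Host);
        info.AddValue("port", endpoint.Port);
    }

    // Receives an uninitialized Endpoint and populates it in place.
    public object SetObjectData(object obj, SerializationInfo info,
                                StreamingContext context, ISurrogateSelector selector)
    {
        var endpoint = (Endpoint)obj;
        endpoint.Host = info.GetString("host");
        endpoint.Port = info.GetInt32("port");
        return endpoint;
    }
}

public static class SurrogateDemo
{
    public static Endpoint RoundTrip(Endpoint original)
    {
        var selector = new SurrogateSelector();
        selector.AddSurrogate(typeof(Endpoint),
                              new StreamingContext(StreamingContextStates.All),
                              new EndpointSurrogate());

        var formatter = new BinaryFormatter();
        formatter.SurrogateSelector = selector;

        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, original);
            stream.Position = 0;
            return (Endpoint)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        var original = new Endpoint();
        original.Host = "localhost";
        original.Port = 8080;

        Endpoint copy = RoundTrip(original);
        Console.WriteLine(copy.Host);  // localhost
        Console.WriteLine(copy.Port);  // 8080
    }
}
```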

  13. Object Substitution
      • IObjectReference
          • GetRealObject method returns the deserialized object
          • When the object is returned, it and its descendants are completely deserialized
      • Used extensively for returning singleton system objects
          • Types, Delegates
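A user-defined singleton can use the same mechanism. This is a minimal sketch, assuming a .NET Framework target; `Registry` and `Holder` are hypothetical types.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public sealed class Registry : IObjectReference
{
    public static readonly Registry Instance = new Registry();

    private Registry() { }

    // The formatter creates a throwaway Registry, then asks it which object
    // should really appear in the deserialized graph.
    public object GetRealObject(StreamingContext context)
    {
        return Instance;
    }
}

[Serializable]
public class Holder
{
    public Registry Shared = Registry.Instance;
}

public static class ObjectReferenceDemo
{
    public static Holder RoundTrip(Holder original)
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, original);
            stream.Position = 0;
            return (Holder)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        Holder copy = RoundTrip(new Holder());
        // The holder is a new object, but its field points at the one singleton.
        Console.WriteLine(object.ReferenceEquals(copy.Shared, Registry.Instance));  // True
    }
}
```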

  14. Object Fixup
      • A reference can appear in the stream before the object it refers to
      • Serialization swizzles object references to integers

  15. Object Fixup Complications
      • Value classes must be fixed up before they are boxed
      • Object graphs directly referenced by an ISerializable object must be deserialized one level
      • An IObjectReference object graph must be completely deserialized
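The basic fixup idea, before any of these complications, can be sketched with a toy reader. This is not the CLR implementation: the `Record` layout, the `Node` type, and the `FixupSketch` class are hypothetical, and the sketch handles only a single reference field per object.

```csharp
using System;
using System.Collections.Generic;

public class Node
{
    public string Name;
    public Node Next;
}

// Hypothetical stream record: the object's id, its data, and the id of the
// object it references (0 means null). References are integers, not pointers.
public class Record
{
    public readonly int Id;
    public readonly string Name;
    public readonly int NextId;

    public Record(int id, string name, int nextId)
    {
        Id = id;
        Name = name;
        NextId = nextId;
    }
}

public static class FixupSketch
{
    public static Dictionary<int, Node> Read(IEnumerable<Record> stream)
    {
        var objects = new Dictionary<int, Node>();
        var fixups = new List<KeyValuePair<Node, int>>();

        foreach (Record record in stream)
        {
            var node = new Node();
            node.Name = record.Name;
            objects[record.Id] = node;

            if (record.NextId == 0) continue;

            Node target;
            if (objects.TryGetValue(record.NextId, out target))
                node.Next = target;   // target already read: link immediately
            else
                fixups.Add(new KeyValuePair<Node, int>(node, record.NextId));  // forward reference
        }

        // Every object now exists, so pending references can be resolved.
        foreach (KeyValuePair<Node, int> fixup in fixups)
            fixup.Key.Next = objects[fixup.Value];

        return objects;
    }

    public static void Main()
    {
        // Object 1 refers to object 2 before object 2 has been read.
        var stream = new[] { new Record(1, "a", 2), new Record(2, "b", 1) };
        Dictionary<int, Node> objects = Read(stream);
        Console.WriteLine(objects[1].Next.Name);                              // b
        Console.WriteLine(object.ReferenceEquals(objects[2].Next, objects[1]));  // True
    }
}
```

The complications on the slide arise because some objects cannot simply be created first and patched later: a boxed value type is a copy, and ISerializable or IObjectReference objects need their inputs available before their own code runs.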

  16. IDeserializationCallback
      • Used to signal that deserialization is complete
      • E.g. Hashtable can’t create hashes until all the objects are deserialized
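The same idea applies to any derived state that should be rebuilt rather than stored. A minimal sketch, assuming a .NET Framework 2.0 or later target; `WordIndex` is a hypothetical type.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class WordIndex : IDeserializationCallback
{
    private readonly List<string> words = new List<string>();

    // Derived lookup table: left out of the stream and rebuilt afterwards.
    [NonSerialized]
    private Dictionary<string, int> positions = new Dictionary<string, int>();

    public void Add(string word)
    {
        positions[word] = words.Count;
        words.Add(word);
    }

    public int PositionOf(string word)
    {
        return positions[word];
    }

    // Called once the whole graph has been deserialized, so 'words' is complete.
    public void OnDeserialization(object sender)
    {
        positions = new Dictionary<string, int>();
        for (int i = 0; i < words.Count; i++)
            positions[words[i]] = i;
    }
}

public static class CallbackDemo
{
    public static WordIndex RoundTrip(WordIndex original)
    {
        var formatter = new BinaryFormatter();
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, original);
            stream.Position = 0;
            return (WordIndex)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        var index = new WordIndex();
        index.Add("alpha");
        index.Add("beta");

        WordIndex copy = RoundTrip(index);
        Console.WriteLine(copy.PositionOf("beta"));  // 1
    }
}
```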

  17. Formatter Classes

  18. IFormatter - Object Graph
      • Serialize(Stream s, Object graph)
      • Object Deserialize(Stream s)
      • Properties
          • ISurrogateSelector
          • SerializationBinder (type substitution when deserializing)
          • StreamingContext
              • CrossProcess
              • CrossMachine
              • File
              • Persistence
              • Remoting
              • Other
              • Clone
              • CrossAppDomain
              • All
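One use of the StreamingContext property is letting an object adapt what it writes to the destination. The sketch below is an assumption about how that might look, for a .NET Framework target; `Session` and its policy of withholding a token from files are hypothetical. The context is set explicitly on both formatters, since a default BinaryFormatter uses StreamingContextStates.All.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Session : ISerializable
{
    public string User;
    public string Token;

    public Session(string user, string token)
    {
        User = user;
        Token = token;
    }

    protected Session(SerializationInfo info, StreamingContext context)
    {
        User = info.GetString("user");
        // The token was only written when the destination was not a file.
        Token = IsFile(context) ? null : info.GetString("token");
    }

    public void GetObjectData(SerializationInfo info, StreamingContext context)
    {
        info.AddValue("user", User);
        if (!IsFile(context))
            info.AddValue("token", Token);
    }

    private static bool IsFile(StreamingContext context)
    {
        return (context.State & StreamingContextStates.File) != 0;
    }
}

public static class ContextDemo
{
    // Written against IFormatter, so any formatter can be passed in.
    public static Session RoundTrip(IFormatter formatter, Session original)
    {
        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, original);
            stream.Position = 0;
            return (Session)formatter.Deserialize(stream);
        }
    }

    public static IFormatter FormatterFor(StreamingContextStates state)
    {
        IFormatter formatter = new BinaryFormatter();
        formatter.Context = new StreamingContext(state);
        return formatter;
    }

    public static void Main()
    {
        var session = new Session("ann", "secret");

        Session toFile = RoundTrip(FormatterFor(StreamingContextStates.File), session);
        Session toProcess = RoundTrip(FormatterFor(StreamingContextStates.CrossProcess), session);

        Console.WriteLine(toFile.Token == null);  // True
        Console.WriteLine(toProcess.Token);       // secret
    }
}
```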

  19. IRemotingFormatter - RPC
      • Serialize(Stream s, Object graph, Header[] headers)
          • Two serializations
              • Graph (parameter array)
              • Headers (Header array)
      • Object Deserialize(Stream s, HeaderHandler handler)
          • Delegate: Object HeaderHandler(Header[] headers)
          • Headers are handed to the delegate; the delegate returns the object into which the parameters are deserialized

  20. Formatter Property Enums
      • FormatterTypeStyle
          • TypesWhenNeeded: types are output for
              • Arrays of objects
              • Object fields, inheritable fields
              • ISerializable
          • TypesAlways
              • Version compatibility
              • MemberInfo -> ISerializable
      • FormatterAssemblyStyle
          • Simple: no version information
          • Full: full assembly name
      • Defaults
          • Remoting: serialization Full, deserialization Simple
          • Non-remoting: serialization Full, deserialization Full
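On BinaryFormatter these enums are exposed as the TypeFormat and AssemblyFormat properties. The following sketch, assuming a .NET Framework target, only shows how they are set and compares the resulting stream sizes; `Point` and `FormatterStyleDemo` are hypothetical names, and the exact byte counts depend on the assembly name and runtime version, so none are claimed here.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Point
{
    public int X;
    public int Y;
}

public static class FormatterStyleDemo
{
    public static byte[] Write(Point point, FormatterAssemblyStyle assemblyStyle,
                               FormatterTypeStyle typeStyle)
    {
        var formatter = new BinaryFormatter();
        formatter.AssemblyFormat = assemblyStyle;  // Simple or Full assembly names
        formatter.TypeFormat = typeStyle;          // TypesWhenNeeded or TypesAlways

        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, point);
            return stream.ToArray();
        }
    }

    public static Point Read(byte[] bytes)
    {
        var formatter = new BinaryFormatter();
        // Simple matching on read accepts assembly names with or without version.
        formatter.AssemblyFormat = FormatterAssemblyStyle.Simple;

        using (var stream = new MemoryStream(bytes))
        {
            return (Point)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        var point = new Point();
        point.X = 3;
        point.Y = 4;

        byte[] full = Write(point, FormatterAssemblyStyle.Full, FormatterTypeStyle.TypesAlways);
        byte[] simple = Write(point, FormatterAssemblyStyle.Simple, FormatterTypeStyle.TypesWhenNeeded);

        Console.WriteLine("Full:   {0} bytes", full.Length);
        Console.WriteLine("Simple: {0} bytes", simple.Length);
        Console.WriteLine(Read(simple).Y);  // 4
    }
}
```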

  21. SoapFormatter Additional Properties
      • ISoapMessage: alternate way of specifying parameter/header serialization
          • ParamNames
          • ParamValues
          • ParamTypes
          • MethodName
          • XmlNameSpace
          • Header[] headers

  22. BinaryFormatter
      • Binary stream format design
          • Primitive types are written directly
          • Arrays of primitives: bytes are copied directly from the CLR (100x faster than using reflection)
          • All other types are written as records
      • Basic record types
          • SerializedStreamHeader, Object, ObjectWithMap, ObjectWithMapAssemId, ObjectWithMapTyped, ObjectWithMapTypedAssemId, ObjectString, Array, MemberPrimitiveTyped, MemberReference, ObjectNull, MessageEnd, Assembly
      • Record types added later for performance
          • ObjectNullMultiple256, ObjectNullMultiple, ArraySinglePrimitive, ArraySingleObject, ArraySingleString, CrossAppDomainMap, CrossAppDomainString, CrossAppDomainAssembly, MethodCall, MethodReturn

  23. Serialization
      [Diagram: serialization of a graph of objects numbered 1 to 10]

  24. Serialization Complications
      • MethodCall/MethodReturn
      • CrossAppDomain
      • Determining when type information is needed
      • Value classes are nested; non-value classes are top level
      • Arrays: mix of jagged and multi-dimensional, e.g. [][,,][]
      • Arrays of primitives are copied to the stream as a collection of bytes
      • Surrogates
      • ISerializable

  25. Deserialization
      [Diagram: graph of objects numbered 1 to 10]
      • Fixups
          • Process 1: fixups 2, 3, 4
          • Process 2: fixups 5, 6
          • Process 3: fixups 7
          • Process 4: fixups 8, 9

  26. Deserialization Binary
      • Parsing
          • Record headers specify what is coming next in the stream
          • Primitives do not have headers, so previously encountered record headers must be used as a map for reading primitives

  27. Deserialization Complications
      • Remoting
      • MethodCall/MethodReturn optimization
      • CrossAppDomain
      • Value Type
      • ISerializable
      • Surrogate

  28. Retrospective

  29. What Went Wrong - 1
      • Beta1 gave the GC a workout
          • Object-oriented style is dangerous for plumbing: lots of objects are created
      • Solution
          • Use object singletons (or a fixed number of objects)
          • Object pools
          • Start with larger storage for growing objects such as ArrayLists
          • Special cases: for primitive parameters, the serialization classes aren’t used, so they aren’t initialized

  30. What Went Wrong - 2
      • Performance is never good enough
      • Reflection is slow
          • Boxes value types
          • Interpretive
      • Serialization classes are slow
          • Boxes value types
          • Keeps lots of state around in resizable arrays

  31. What Went Wrong - 3
      • Formatters are slow
          • Object type and field information inflates the size of the stream (reflection and versioning requirement)
          • Lots of irregular cases
              • CLR: value types, singletons, transformations
              • Serialization: ISerializable, resolving graph rules
      • Code is more general than it has to be
          • Now we know, but during development the underlying system kept changing
              • CLR object model (variants, reflection, security, BCL, etc.)
              • Serialization model (ISerializable underwent many changes)
              • SOAP spec kept changing
              • Binary format changed for perf reasons
      • Fixups are used too much: strings and value classes are put in the stream when encountered, while object references are put in the stream with the object coming later
          • SOAP 1.2 nests reference objects
          • BinaryFormatter should be changed to nest objects

  32. What Went Wrong - 4
      • Why didn’t we use Reflection.Emit?
          • 1200 serializations needed to make up the cost
          • Couldn’t serialize private and internal fields
      • BinaryFormatter primitive arrays use array copy rather than reflection
          • 100x faster when the switch was made
      • Cross-AppDomain smuggling
          • Primitives and strings bypass the BinaryFormatter, resulting in faster times than COM cross-process
      • BinaryFormatter prototyped an option to omit type information in the stream
          • 4-byte point class serialized in 10 bytes instead of 125 bytes
      • Future versions of the formatters will be much faster
          • Improvements to Reflection.Emit
          • Cross-AppDomain serialization prototype implemented in the EE

  33. What Went Wrong - 5
      • Web Services
          • The BinaryFormatter and SoapFormatter existed before the Web Service classes
          • Serialization, Formatter, and Remoting classes are based on object-oriented programming, RPC, and COM models
          • Web Services started to gain importance late in the development of the .NET Framework
          • Future releases will combine the two models, using the same custom attributes and underlying messaging model
      • SoapFormatter
          • Specify the shape of the stream to some extent
          • Object WSDL: added additional schema information to WSDL to allow generation of the CLR object model in client proxies
          • Object WSDL is the only way in .NET Framework V1 to copy CLR metadata without copying the DLL, which includes code

  34. The Formatters are Great (at least useful)
      • Only way to make a deep copy of an object graph with complete fidelity
      • Integrated with .NET Remoting
      • Combines the CLR object model with the Web Services model
      • Version resilient (at least the attempt is made)
      • Secure
      • Perf isn’t all that bad
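The deep-copy use mentioned first on this slide is commonly written as a serialize-then-deserialize helper. A minimal sketch, assuming a .NET Framework 2.0 or later target; `DeepCopy.Clone` and the `Employee` type are hypothetical, and every type in the graph must be serializable.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
public class Employee
{
    public string Name;
    public Employee Manager;
    public List<Employee> Reports = new List<Employee>();
}

public static class DeepCopy
{
    public static T Clone<T>(T graph)
    {
        // The Clone context tells participating objects why they are being serialized.
        var formatter = new BinaryFormatter(
            null, new StreamingContext(StreamingContextStates.Clone));

        using (var stream = new MemoryStream())
        {
            formatter.Serialize(stream, graph);
            stream.Position = 0;
            return (T)formatter.Deserialize(stream);
        }
    }

    public static void Main()
    {
        var boss = new Employee();
        boss.Name = "boss";
        var worker = new Employee();
        worker.Name = "worker";
        worker.Manager = boss;       // cycle: boss -> worker -> boss
        boss.Reports.Add(worker);

        Employee copy = Clone(boss);
        Console.WriteLine(object.ReferenceEquals(copy, boss));                     // False
        Console.WriteLine(copy.Reports[0].Name);                                   // worker
        Console.WriteLine(object.ReferenceEquals(copy.Reports[0].Manager, copy));  // True
    }
}
```

The last line shows that sharing and cycles inside the graph survive the copy: the cloned worker points back at the cloned boss, not at the original.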
