1 / 37

LDC : The LLVM-based D Compiler Using LLVM as backend for a D compiler

LDC : The LLVM-based D Compiler Using LLVM as backend for a D compiler. Kai Nacke 02/02/14 LLVM devroom @ FOSDEM´ 14. Agenda. Brief introduction to D Internals of the LDC compiler Used LLVM features Possible improvements of LLVM. What is D?. C-like syntax Static typing

hija
Download Presentation

LDC : The LLVM-based D Compiler Using LLVM as backend for a D compiler

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. LDC: The LLVM-based D CompilerUsing LLVM as backend for a D compiler Kai Nacke 02/02/14 LLVM devroom @ FOSDEM´14

  2. Agenda • Brief introduction to D • Internals of the LDC compiler • Used LLVM features • Possible improvements of LLVM

  3. What is D? • C-like syntax • Static typing • Supports many paradigms • Polymorphism, functional style, generics, contract programming • Scales up to large projects • Modules, interfaces, unit tests • Convenient and powerful features • Garbage collection, array slices, compile-time function execution (CTFE)

  4. Code Examples import std.stdio; import std.array; import std.algorithm; void main() { stdin.byLine(KeepTerminator.yes). map!(a => a.idup). array. sort. copy(stdout.lockingTextWriter()); } void main() { import std.stdio; writeln("Hello FOSDEM!"); } module myalgo; import std.traits; T gcd(T)(T a, T b) pure if(isIntegral!T) { while (b) { auto t = b; b = a % b; a = t; } return a; } unittest { // CTFE enum val1 = gcd(3, 5); assert(val1 == 1); // No CTFE assert(gcd(25, 35)== 5); }

  5. Code Examples - hello import std.stdio; import std.array; import std.algorithm; void main() { stdin.byLine(KeepTerminator.yes). map!(a => a.idup). array. sort. copy(stdout.lockingTextWriter()); } void main() { import std.stdio; writeln("Hello FOSDEM!"); } voidmain() { importstd.stdio; writeln("Hello FOSDEM!"); } module myalgo; import std.traits; T gcd(T)(T a, T b) pure nothrow if(isIntegral!T) { while (b) { auto t = b; b = a % b; a = t; } return a; } unittest { // CTFE enum val1 = gcd(3, 5); assert(val1 == 1); // No CTFE assert(gcd(25, 35)== 5); }

  6. Code Examples - gcd module myalgo; import std.traits; Tgcd(T)(Ta, Tb) pureif(isIntegral!T) { while (b) { autot = b; b = a % b; a = t; } returna; } unittest{ // CTFE enumval= gcd(3, 5); assert(val== 1); // No CTFE assert(gcd(25, 35)== 5); } import std.stdio; import std.array; import std.algorithm; void main() { stdin.byLine(KeepTerminator.yes). map!(a => a.idup). array. sort. copy(stdout.lockingTextWriter()); } void main() { import std.stdio; writeln("Hello FOSDEM!"); } module myalgo; import std.traits; T gcd(T)(T a, T b) pure nothrow if(isIntegral!T) { while (b) { auto t = b; b = a % b; a = t; } return a; } unittest { // CTFE enum val1 = gcd(3, 5); assert(val1 == 1); // No CTFE assert(gcd(25, 35)== 5); }

  7. Code Examples - sort import std.stdio; import std.array; import std.algorithm; void main() { stdin.byLine(KeepTerminator.yes). map!(a => a.idup). array. sort. copy(stdout.lockingTextWriter()); } void main() { import std.stdio; writeln("Hello FOSDEM!"); } import std.stdio; import std.array; import std.algorithm; void main() { stdin.byLine(KeepTerminator.yes). map!(a => a.idup). array. sort. copy(stdout.lockingTextWriter()); } module myalgo; import std.traits; T gcd(T)(T a, T b) pure nothrow if(isIntegral!T) { while (b) { auto t = b; b = a % b; a = t; } return a; } unittest { // CTFE enum val1 = gcd(3, 5); assert(val1 == 1); // No CTFE assert(gcd(25, 35)== 5); }

  8. The case for D • 18th place on the TIOBE index • Companies adopting D • Books about D

  9. LDC = + What about combining the D frontend with LLVM? The idea is about 10 years old! The result is LDC.

  10. Facts about LDC • Version 0.13.0 alpha recently announced • Written in C++ (transition to D planned) • Requires LLVM 3.1 or later • Runs on most Posix-like x86/x86_64 OS‘s • Linux, OS X, FreeBSD, Mingw32 • Native Windows version depends on LLVM • Port to Linux/PPC64 almost finished • Work on port to Linux/ARM has started

  11. The architecture of LDC • Typical multi-pass compiler • Lexer, Parser, Analyzer from DMD • Type mapping and IR generation • Code generation with LLVM • Illustrates Conway‘s law Driver Lexer, Parser, Analyzer Type mapping IR generation LLVM

  12. Closer look at data flow a.out.a.so .d Creates Abstract Syntax Tree (AST) Reads sourcefiles With gcc/clang Lexer, Parser, Analyzer Driver LLVM API LLVM Type mapping IR generation .o.s.ll

  13. Abstract Syntax Tree from Frontend • Frontend generates fully decorated AST if (b > 0) { writefln("b > 0"); } else writefln("b <= 0"); IfStatement Condition : CmpExpr Ifbody : ScopeStatement Elsebody : ExpStatement

  14. AST • Frontend already lowers some statements • Simplifies IR generation • Complicates generation of debug info try { /* stmts */ } finally { ... } scope(exit) { ... } /* stmts */ for (intkey = 1; key < 10; ++key) { /* stmts */ } foreach(a; 1..10) { /* stmts */ }

  15. IR generation • IR is generated with a tree traversal • Uses visitor provided by frontend • New code, it is different in current release • IR is attached to main entities • Example: FuncDeclaration <- IrFunction • Generated IR could be improved

  16. IR generation example - if inta; if (a > 0) ... else ... %tmp = loadi32*%a %tmp1 = icmpsgti32%tmp, 0 bri1%tmp1, label %if, label%else if: ... brlabel %endif else: ... brlabel %endif endif:

  17. IR generation example - for for (inti = 0;i < a.length; ++i) { ... } ... brlabel%forcond forcond: ... bri1%tmp6, label %forbody, label %endfor forbody: ... brlabel %forinc forinc: ... brlabel %forcond endfor:

  18. Type mapping • Mapping of simple types • Mapping of type bool • Use of i1 for bool turned out to be wrong! byte / ubyte short / ushort int / uint long / ulong float double real i8 i16 i32 i64 float double x86_fp80 / ppc_fp80 / double bool i8

  19. Type mapping • Static arrays are mapped to LLVM arrays • Dynamic arrays are mapped as anonymous structs with length and pointer to data • Associative arrays are opaque to the compiler int[5] [5 x i32] int[] { i64, i32* } int[string] i8*

  20. Type mapping • Structs are mapped to LLVM structs • Classes are structs with a vtable and a monitor • Padding is added if required struct { byte a; byte b; short c; } type{ i8, i8, i16 } class { longa; } type{ %type.ClassA.__vtbl*, i8*, i64 }

  21. D-specific LLVM passes • LLVM knows nothing about D and its runtime library! • Adding D-specific knowledge to LLVM helps to optimize the code • Currently these passes are available • GarbageCollect2Stack • SimplifyDRuntimeCalls • StripExternals

  22. D-specific LLVM passes • GarbageCollect2Stack • Tries to turn a GC allocation into a stack allocation • Useful for closures intb = 5; foreach (inti; 0..10) { apply(a => b*a, i); b = 2*b; } • Requires memory allocation for nested frame • Can be turned into alloca if memory does not escape function • Based on PointerMayBeCaptured • Conservative in loops, can be improved

  23. D-specific LLVM passes • SimplifyDRuntimeCalls • Replaces/optimizes calls of D runtime, mainly for arrays • Framework copied from SimplifyLibcalls • StripExternals • Removes body of functions declared as available_externally • Used as support for global dead code elimination (GlobalDCE)

  24. Porting to new platforms • Required LLVM features • Thread-local storage (TLS) • Exception handling • Anonymous structs • Nice to have • Inline assembler • Debug symbols • Some intrinsics • Features are not supported by all LLVM targets

  25. Porting to new platforms • Common problems found in LLVM • TLS is not implemented / partially implemented / buggy • Wrong relocation generated • Most of the time TLS relocations • E.g. on Linux/ARM – missing R_ARM_TLS_LDO32 localhosttmp # /build/work/ldc/bin/ldc2 -g hello.d /usr/lib/gcc/armv7a-hardfloat-linux-gnueabi/4.8.2/../../../../armv7a-hardfloat-linux-gnueabi/bin/ld: /build/work/ldc/runtime/../lib/libphobos-ldc-debug.a(random-debug.o)(.debug_info+0x31): R_ARM_ABS32 used with TLS symbol _D3std6random17unpredictableSeedFNdZk6seededb

  26. Porting to new platforms • Recommendation for porting Always use LLVM trunk! • A new port can improve quality of LLVM • E.g. PowerPC64: 17 bug reports • Many interesting platforms are yet unsupported (MIPS, AArch64, Sparc, ...)

  27. Inline ASM • LDC supports DMD-style ASM on x86/x86_64 • Inline assembler requires parsing of statements and construction of constraints • Naked functions are translated to modul-level inline assembly • Is not inlined by default asm { naked; movRAX, 8; movRAX, GS:[RAX]; ret; }

  28. Inline ASM • LLVM ASM is supported on all platforms via a special template • Prefered way because it can be inlined • No debug info generated importldc.llvmasm; return __asm!(void*)("movq %gs:8, %rax", "={rax}");

  29. Inline IR • IR can be inlined via a special template • Be aware that IR is not platform independent! pragma(LDC_inline_ir) RinlineIR(strings, R, P...)(P); void*getStackBottom(){ returninlineIR!(` %ptr = inttoptri64%0toi64addrspace(256)* %val = loadi64addrspace(256)* %ptr, align1 %tmp = inttoptri64%valtoi8* reti8*%tmp`, void*, ulong)(8); }

  30. Inline IR • Example translates to (EH removed) • Useful to access IR features which are otherwise not available (e.g. shufflevector) • Best result because of tight integration with LLVM • No debug info generated _D3thr14getStackBottomFZPv: movq %gs:8, %rax retq

  31. Attributes • Used to change LLVM function attributes • Some attributes require more work • Experimental feature, not yet finished importldc.attribute; @attribute("alwaysinline") voidfunc() { // ... }

  32. Integrating AdressSanitizer • Integrate sanitizer passes into compile • opt is used as blueprint • Add new option • -sanitize=address • Add attribute SanitizeAddress to every function definition

  33. Integrating AdressSanitizer • Compile runtime library with new option • Still missing: runtime support • Own allocator with GC • Use gcstub/gc.d instead • Produces some aborts in unit tests on Linux/x86_64 • Evaluation of reports not complete

  34. Better ABI support • Frontend must have intimate knowledge of calling convention • Degree of implementation of attributes varies • PPC64: Good • Win64: Needs more work (byval with structs) • LDC uses an abi class to encapsulate the details for each supported platform • Improvement: more complete implementation

  35. Better ABI support • Improvement: Implement helper for C ABI lowering in LLVM • It is the default ABI in LLVM and every major language needs this • Think of the effort required for MSC ABI in Clang • Even better: Abstract the details away

  36. Better Windows support • No exception handling on native Windows • Most wanted feature by LDC users! • I am working on the implementation • There is already a Clang driver in PR18654 • No CodeView debug symbols • COFF line number support added recently • I started to work on this topic (a year ago...)

  37. Looking for contributors If you want to have fun with ... a cool language ... a friendly community ... hacking LLVM then start contributing to LDC today! http://forum.dlang.org/group/digitalmars.D.ldc https://wiki.dlang.org/LDC https://github.com/ldc-developers

More Related