Compilers Design and Construction
Lecture 1
Chapter 1
Course Aims
• Any program written in a programming language must be translated before it can be executed.
• This translation is typically accomplished by a software system called a compiler.
• This course aims to introduce the principles and techniques used to perform this translation and the issues that arise in the construction of a compiler.
Our course talks about:
Plan
Week        Subject                         Reading
1           Introduction                    Ch 1
2           Lexical analysis                Ch 2-3
3, 4        Syntax analysis                 Ch 4-5
5           Syntax, semantic analysis       Ch 4-5, Ch 6
6           Semantic analysis               Ch 6
7           Midterm exam
8           Intermediate code               Ch 7-8
9, 10, 11   Control flow, code generation   Ch 9-10
12          Code optimization               Ch 10
Assessments
Topic             Mark
Lab               20%
Mid Term Exam     15%
Presence          5%
Final Term Exam   60%
Learning Outcomes:
• A student successfully completing this course should be able to:
• understand the principles governing all phases of the compilation
process.
• understand the role of each of the basic components of a
standard compiler.
• show awareness of the problems of, and the methods and techniques applied to, each phase of the compilation process.
• apply standard techniques to solve basic problems that arise in
compiler construction.
• understand how the compiler can take advantage of particular
processor characteristics to generate good code.
References
• Class textbook
  • Compilers: Principles, Techniques, and Tools, by Aho, Sethi, and Ullman
• Other useful books
  • Advanced Compiler Design & Implementation, Steven Muchnick
  • Building an Optimizing Compiler, Robert Morgan
  • Modern Compiler Implementation in Java, Andrew Appel
Software Categories
• System SW
• Programs written for computer systems
• Compilers, operating systems, …
• Application SW
• Programs written for computer users
• Word-processors, spreadsheets, & other application packages
A Layered View of a Computer from the Perspective of a Compiler
• Machine with all its hardware
• System software: compilers, interpreters, preprocessors, etc.; operating system, device drivers
• Application programs: word processors, spreadsheets, database software, IDEs, etc.
Programs
• Any program can be written in any programming language
• A programming language (PL) is
• A set of rules and symbols used to construct a computer program
• A language used to interact with the computer
Why study Compilation Technology?
• Success stories (one of the earliest branches in CS)
• Applying theory to practice (scanning, parsing, static analysis)
• Ideas from different parts of computer science are involved:
• AI: heuristic search techniques; greedy algorithms
• Algorithms: graph algorithms
• Theory: pattern matching
• Also: systems, architecture
• Compiler construction can be challenging and fun:
• new architectures always create new challenges; success requires
mastery of complex interactions; results are useful; opportunity to
achieve performance.
[Figure: three roles (CS expert, programmer, simple user) around a problem that needs to be solved manually or automatically. Labels: "Make SW to solve specific problem"; "Make SW to compile any program".]
Principles of Compilation
The compiler must:
• preserve the meaning of the program being compiled.
• “improve” the source code in some way.
• Space (size of compiled code)
• Feedback (information provided to the user)
• Debugging
• Compilation time efficiency (fast or slow compiler?)
Introduction
chapter 1
Compilers
• “Compilation”
• Translation of a program written in a source language
into a semantically equivalent program written in a
target language
[Figure: Source Program → Compiler → Target Program, with the compiler also producing error messages; Input → Target Program → Output]
Target program: an executable machine-language program.
Interpreters
• “Interpretation”
• Performing the operations implied by the source program
[Figure: Source Program and Input → Interpreter → Output, with the interpreter also producing error messages]
History
IBM developed the 704 in 1954. All programming was done in assembly language. The cost of software development far exceeded the cost of hardware, and productivity was low.
• Speedcoding interpreter: programs ran about 10 times slower than hand-written assembly code.
• John Backus (in 1954) proposed a program that translated high-level expressions into native machine code. There was skepticism all around; most people thought it was impossible.
• Fortran I project (1954-1957): the first compiler was released.
Fortran I
• The first compiler had a huge impact on the programming
languages and computer science. The whole new field of
compiler design was started.
• More than half the programmers were using Fortran by 1958.
• The development time was cut down to half.
• Modern compilers preserve the basic structure of the Fortran I compiler.
Computer Languages
• Machine Language
• Uses binary code
• Machine-dependent
• Not portable
• Assembly Language
• Uses mnemonics (short words that are easy to remember)
• Machine-dependent
• Not usually portable
• High-Level Language (HLL)
• Uses English-like language
• Portable (but must be compiled for different platforms)
• Examples: Pascal, C, C++, Java, Fortran, . . .
Machine Language
• The representation of a computer program which is actually read and understood
by the computer.
• A program in machine code consists of a sequence of machine instructions.
• Instructions:
• Machine instructions are in binary code
• Instructions specify operations and memory cells involved in the operation
Example:
Operation   Address
0010        0000 0000 0100
0100        0000 0000 0101
0011        0000 0000 0110
Assembly Language
A symbolic representation of the machine language of a specific processor.
Is converted to machine code by an assembler.
Each line of assembly code produces one machine instruction (One-to-one correspondence).
Programming in assembly language is slow and error-prone but is more efficient in terms of
hardware performance.
Mnemonic representation of the instructions and data
Example:
Load Price
Add Tax
Store Cost
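The one-to-one correspondence can be sketched as a toy assembler. This is only an illustration: the opcode values and the addresses of Price, Tax and Cost are assumptions, chosen so that the three assembly lines line up with the three machine words in the machine-language example; the slides do not state that pairing.

```python
# Hypothetical tables: 4-bit opcodes and 12-bit addresses (assumed, not from the slides).
OPCODES = {"Load": 0b0010, "Add": 0b0100, "Store": 0b0011}
ADDRESSES = {"Price": 4, "Tax": 5, "Cost": 6}

def assemble(line):
    """Translate one assembly line into exactly one 16-bit machine word."""
    mnemonic, operand = line.split()
    word = (OPCODES[mnemonic] << 12) | ADDRESSES[operand]
    bits = format(word, "016b")
    # Print in groups of four bits: the opcode first, then the 12-bit address.
    return " ".join(bits[i:i + 4] for i in range(0, 16, 4))

program = ["Load Price", "Add Tax", "Store Cost"]
machine_code = [assemble(line) for line in program]
for line, word in zip(program, machine_code):
    print(f"{line:<12} -> {word}")
```

Each input line yields one output word, which is the property that separates an assembler from a compiler.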
High-level language
• A programming language which uses statements consisting of English-like keywords such as "FOR", "PRINT" or "IF".
• Each statement corresponds to several machine language instructions (one-to-many
correspondence).
• Much easier to program than in assembly language.
• Operations can be described using familiar symbols
• Example:
Cost = Price + Tax
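The one-to-many correspondence can be illustrated with a very small translator. It is a sketch under strong assumptions: it only handles statements of the form target = a + b + ..., and the function name compile_assignment and the Load/Add/Store instruction set are taken from the assembly example for illustration only.

```python
def compile_assignment(statement):
    """Translate 'target = a + b + ...' into a list of assembly-like instructions."""
    target, expr = (part.strip() for part in statement.split("="))
    first, *rest = (operand.strip() for operand in expr.split("+"))
    code = [f"Load {first}"]                        # bring the first operand in
    code += [f"Add {operand}" for operand in rest]  # one Add per remaining operand
    code.append(f"Store {target}")                  # write the result back
    return code

assembly = compile_assignment("Cost = Price + Tax")
print(assembly)
```

One high-level statement becomes three instructions here, and a longer sum would produce more.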
Compilers: The Big picture
Editors, Preprocessors, Linkers & Loaders
• Editors
  – Compilers have been bundled together with editors and other programs into an interactive development environment (IDE)
  – May include some operations of a compiler, reporting some errors
• Preprocessors
  – Delete comments, include other files, and perform macro substitutions
• Linkers
  – Collect separate object files into a directly executable file
  – Connect an object program to the code for standard library functions and to resources supplied by the OS
• Loaders
  – Resolve all relocatable addresses relative to a given base
  – Make executable code more flexible
Compiling and running C programs
[Figure: Editor → source code (file.c) → Compiler → object code (file.obj) → Linker, together with libraries → executable code (file.exe)]
Debuggers
• Used to determine execution errors in a compiled program
• Keep track of most or all of the source code information
• Stop execution at pre-specified locations called breakpoints
Debugging program errors
[Figure: Editor → source code (file.c) → Compiler → object code (file.obj) → Linker, together with libraries → executable code (file.exe), annotated with syntactic errors and semantic errors]
Interpreters
• Execute the source program immediately rather than generating
object code
• Examples: BASIC, LISP, used often in educational or development
situations
• Speed of execution is slower than compiled code
• Share many of their operations with compilers
How to translate?
• Direct translation is difficult. Why?
• Source code and machine code mismatch in level of abstraction
  – Variables vs memory locations/registers
  – Functions vs jump/return
  – Parameter passing
  – structs
• Some languages are farther from machine code than others
  – For example, languages supporting the object-oriented paradigm
How to translate easily?
• Translate in steps. Each step handles a reasonably simple, logical, and well-defined task
• Design a series of program representations
• Intermediate representations should be amenable to program manipulation of various kinds (type checking, optimization, code generation, etc.)
• Representations become more machine-specific and less language-specific as the translation proceeds
The first few steps
• The first few steps can be understood by analogy with how humans comprehend a natural language
• The first step is recognizing the alphabet of a language. For example:
  – English text consists of lower- and upper-case letters, digits, punctuation, and white space
  – Written programs consist of characters from the ASCII character set (normally 9-13, 32-126)
The first few steps
• The next step in understanding a sentence is recognizing words
  – How to recognize English words?
  – Words are found in standard dictionaries
  – Dictionaries are updated regularly
The first few steps
• How to recognize words in a programming language?
  – A dictionary (of keywords etc.)
  – Rules for constructing words (identifiers, numbers etc.)
• This is called lexical analysis
• Recognizing words is not completely trivial. For example: w hat ist his se nte nce?
Lexical Analysis: Challenges
• We must know what the word separators are
• The language must define rules for breaking a sentence into a sequence of words
• Normally white space and punctuation are word separators in languages
Lexical Analysis: Challenges
• In programming languages a character from a different class may also be treated as a word separator
• The lexical analyzer breaks a sentence into a sequence of words or tokens:
  – if a == b then a = 1 ; else a = 2 ;
  – Sequence of words (total 14 words):
    if a == b then a = 1 ; else a = 2 ;
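A minimal sketch of this splitting step, assuming a tiny language with identifiers, integers, and the symbols ==, = and ; (the regular expression and the name tokenize are illustrative, not part of the course material). The input is written without spaces around the operators to show that a change of character class also separates words.

```python
import re

# "==" is listed before "=" so that the longer operator wins.
TOKEN_RE = re.compile(r"\s*(==|[A-Za-z_]\w*|\d+|[=;])")

def tokenize(text):
    """Break a sentence into a list of words (tokens)."""
    tokens = []
    pos = 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}: {text[pos]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

words = tokenize("if a==b then a=1; else a=2;")
print(len(words), words)
```

The result has the same 14 words as the spaced-out sentence above.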
The next step
• Once the words are understood, the next step is to understand the structure of the sentence
• The process is known as syntax checking or parsing
Parsing
Parsing a program is the same process as analyzing the structure of a sentence.
• Consider the statement
if x == y then z = 1 else z = 2
Understanding the meaning
• Once the sentence structure is understood, we try to understand the meaning of the sentence (semantic analysis)
• A challenging task
• Example: Prateek said Nitin left his assignment at home
• What does "his" refer to? Prateek or Nitin?
Understanding the meaning
• Worse case: Amit said Amit left his assignment at home
• Even worse: Amit said Amit left Amit's assignment at home
• How many Amits are there? Which one left the assignment? Whose assignment got left?
Semantic Analysis
• Too hard for compilers. They do not have capabilities similar to human understanding
• However, compilers do perform analysis to understand the meaning and catch inconsistencies
• Programming languages define strict rules to avoid such ambiguities
{ int Amit = 3;
  {
    int Amit = 4;
    cout << Amit;
  }
}
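One common way a compiler applies such a rule is a chain of per-block symbol tables, searched from the innermost block outward. The sketch below assumes that design; the Scope class and its method names are hypothetical.

```python
class Scope:
    """Symbol table for one block, linked to the table of the enclosing block."""

    def __init__(self, parent=None):
        self.symbols = {}
        self.parent = parent

    def declare(self, name, value):
        self.symbols[name] = value

    def lookup(self, name):
        # Search the innermost block first, then walk outward.
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        raise NameError(f"undeclared identifier: {name}")

outer = Scope()
outer.declare("Amit", 3)          # { int Amit = 3;
inner = Scope(parent=outer)
inner.declare("Amit", 4)          #   { int Amit = 4;
printed = inner.lookup("Amit")    #     cout << Amit;
print(printed)
```

The inner declaration hides the outer one, so there is no ambiguity about which Amit is meant.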
More on Semantic Analysis
• Compilers perform many other checks besides variable bindings
• Type checking example: Amit left her work at home
• There is a type mismatch between "her" and "Amit". Presumably Amit is male, so "her" and "Amit" are not the same person.
Code Optimization
• No strong counterpart in English, but it is similar to editing/précis writing
• Automatically modify programs so that they
  – Run faster
  – Use fewer resources (memory, registers, space, fewer fetches, etc.)
Code Optimization
• Some common optimizations
  – Common sub-expression elimination
  – Copy propagation
  – Dead code elimination
  – Code motion
  – Strength reduction
  – Constant folding
• Example: x = 15 * 3 is transformed to x = 45
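The example transformation is constant folding. As a rough sketch, it can be tried out with Python's standard ast module, which parses Python syntax rather than the course's source language; that substitution, and the names ConstantFolder and fold, are assumptions made for illustration.

```python
import ast
import operator

# Operators this sketch is willing to evaluate at compile time.
FOLDABLE = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

class ConstantFolder(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first (bottom-up)
        op = FOLDABLE.get(type(node.op))
        if (op is not None
                and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)
                and isinstance(node.left.value, int)
                and isinstance(node.right.value, int)):
            folded = ast.Constant(value=op(node.left.value, node.right.value))
            return ast.copy_location(folded, node)
        return node

def fold(source):
    """Return the source text with integer constant expressions evaluated."""
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9 or later

print(fold("x = 15 * 3"))
```

Expressions that involve variables are left alone, except for any constant sub-expressions inside them.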
Compiler
Compilers
• Analysis of the source program.
• Synthesis into a machine-language program.
Parts of Compilers
Analysis (Front End)
1. Lexical Analysis
2. Syntax Analysis
3. Semantic Analysis
Synthesis (Back End)
4. Code Generation
5. Optimization
Compilers
• Analysis
• Front End
• Split source code into its constituent pieces (tokens).
• Group the pieces according to grammatical rules (parsing).
• Report errors.
• Synthesis
• Back End
• Produce intermediate code
• Optimize intermediate code
• Generate target code (machine language code)
Structure of a Compiler
• Front end: analysis
• Read source program and understand its structure and meaning
• Back end: synthesis
• Generate equivalent target language program
[Figure: Source → Front End → Back End → Target]
Phases of a Compiler
[Figure: Source Program → Lexical Analyzer → Syntax Analyzer → Semantic Analyzer → Intermediate Code Generator → Code Optimizer → Code Generator → Target Program; the Symbol Table Manager and Error Handler interact with the phases]
The Structure of a Compiler
[Figure: Source Program (character stream) → Scanner → tokens → Parser → syntactic structure → Semantic Routines → intermediate representation → Optimizer → Code Generator → target machine code; symbol and attribute tables are used by all phases of the compiler]
Analysis phase
Synthesis phase
by Neng-Fa Zhou
Analysis of the source program
[Figure: source program → lexical analyzer → tokens → syntax analyzer → parse tree → semantic analyzer → parse tree]
Scanner (Lexical Analyzer)
The scanner begins the analysis of the source program by reading the input, character by character, and grouping characters into individual words and symbols (tokens).
• Puts information about identifiers into the symbol table.
• Regular expressions are used to describe tokens (lexical constructs).
• A (deterministic) finite state automaton can be used in the implementation of a lexical analyzer.
Scanner (Lexical Analyzer)
Ex: newval = oldval + 12
Tokens:
newval   identifier
=        assignment operator
oldval   identifier
+        add operator
12       number
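The classification above can be sketched with one regular expression per token class. The patterns, the scan function, and the shape of the symbol table entries are assumptions for illustration; a production scanner would typically compile such patterns into a single finite automaton.

```python
import re

# One regular expression per token class (patterns are assumed, not from the slides).
TOKEN_SPEC = [
    ("identifier", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("number", re.compile(r"[0-9]+")),
    ("assignment operator", re.compile(r"=")),
    ("add operator", re.compile(r"\+")),
]

def scan(text):
    """Return (tokens, symbol_table) for one line of source text."""
    tokens = []
    symbol_table = {}
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        for kind, pattern in TOKEN_SPEC:
            match = pattern.match(text, pos)
            if match:
                lexeme = match.group()
                tokens.append((lexeme, kind))
                if kind == "identifier":
                    # Record each identifier once; later phases add type, location, etc.
                    symbol_table.setdefault(lexeme, {"name": lexeme})
                pos = match.end()
                break
        else:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
    return tokens, symbol_table

tokens, symbol_table = scan("newval = oldval + 12")
for lexeme, kind in tokens:
    print(f"{lexeme:<8} {kind}")
```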
Parser (Syntax Analyzer)
• Given a formal syntax specification (typically as a context-free grammar [CFG]), the parser reads tokens and groups them into units as specified by the productions of the CFG being used.
• As syntactic structure is recognized, the parser either calls corresponding semantic routines directly or builds a syntax tree.
• CFG (Context-Free Grammar)
• BNF (Backus-Naur Form)
• GAA (Grammar Analysis Algorithms)
Parser (Syntax Analyzer)
• A Syntax Analyzer creates the syntactic structure (generally a
parse tree) of the given program.
• A syntax analyzer is also called a parser.
• A parse tree describes a syntactic structure.
Parser (Syntax Analyzer (CFG) )
• The syntax of a language is specified by a context free grammar
(CFG).
• The rules in a CFG are mostly recursive.
• A syntax analyzer checks whether a given program satisfies the rules
implied by a CFG or not.
• If it satisfies, the syntax analyzer creates a parse tree for the given program.
• Ex: We use BNF (Backus Naur Form) to specify a CFG
assgstmt -> identifier := expression
expression -> identifier
expression -> number
expression -> expression + expression
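A small recursive-descent parser for this grammar might look as follows. It is a sketch under stated assumptions: the rule expression -> expression + expression is left-recursive and ambiguous as written, so the code uses the equivalent iterative form "operand (+ operand)*" and builds a left-associative tree; tokens are obtained by splitting on spaces, and all function names are hypothetical.

```python
def is_identifier(tok):
    return tok is not None and tok.isidentifier()

def is_number(tok):
    return tok is not None and tok.isdigit()

def parse_operand(tokens, pos):
    """expression -> identifier | number"""
    tok = tokens[pos] if pos < len(tokens) else None
    if is_identifier(tok):
        return ("identifier", tok), pos + 1
    if is_number(tok):
        return ("number", tok), pos + 1
    raise SyntaxError(f"expected identifier or number, got {tok!r}")

def parse_expression(tokens, pos):
    """expression -> expression + expression, parsed as operand (+ operand)*"""
    tree, pos = parse_operand(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_operand(tokens, pos + 1)
        tree = ("+", tree, right)
    return tree, pos

def parse_assgstmt(tokens):
    """assgstmt -> identifier := expression"""
    if not tokens or not is_identifier(tokens[0]):
        raise SyntaxError("expected identifier at start of assignment")
    if len(tokens) < 2 or tokens[1] != ":=":
        raise SyntaxError("expected ':='")
    expr, pos = parse_expression(tokens, 2)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return (":=", ("identifier", tokens[0]), expr)

tree = parse_assgstmt("newval := oldval + 12".split())
print(tree)
```

A program that violates the rules, such as newval := + 12, is rejected with a SyntaxError instead of producing a tree.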
Syntax Analyzer versus Lexical Analyzer
• Which constructs of a program should be recognized by the
lexical analyzer, and which ones by the syntax analyzer?
• Both of them do similar things, but the lexical analyzer deals with simple non-recursive constructs of the language.
• The syntax analyzer deals with recursive constructs of the language.
• The lexical analyzer simplifies the job of the syntax analyzer.
• The lexical analyzer recognizes the smallest meaningful units (tokens) in
a source program.
• The syntax analyzer works on the smallest meaningful units (tokens) in
a source program to recognize meaningful structures in our
programming language.
Semantic Routines
• Perform two functions
  • Check the static semantics of each construct
  • Do the actual translation
• The heart of a compiler
• Result is: syntax-directed translation
• Semantic processing techniques
Ex:
newval = oldval + 12
The type of the identifier newval must match the type of the expression (oldval + 12).
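A rough sketch of this check, assuming a symbol table that maps each identifier to a declared type and a rule that both operands of + must have the same type. The declared types, the extra identifier label, and the function names are made up for illustration.

```python
# Hypothetical symbol table: identifier -> declared type.
SYMBOLS = {"newval": "int", "oldval": "int", "label": "string"}

def type_of(expr):
    """expr is an int literal, an identifier name, or a tuple (op, left, right)."""
    if isinstance(expr, int):
        return "int"
    if isinstance(expr, str):
        if expr not in SYMBOLS:
            raise NameError(f"undeclared identifier: {expr}")
        return SYMBOLS[expr]
    op, left, right = expr
    left_type, right_type = type_of(left), type_of(right)
    if left_type != right_type:
        raise TypeError(f"operands of {op} have types {left_type} and {right_type}")
    return left_type

def check_assignment(target, expr):
    """Return the common type, or raise TypeError on a mismatch."""
    target_type, expr_type = type_of(target), type_of(expr)
    if target_type != expr_type:
        raise TypeError(f"cannot assign {expr_type} to {target} of type {target_type}")
    return target_type

result = check_assignment("newval", ("+", "oldval", 12))
print(result)
```

Assigning the same expression to the string variable label would be reported as a type error at compile time.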
Semantic Analysis
type checking
type conversion
Symbol Table
• There is a record for each identifier
• The attributes include name, type, location, etc.
Synthesis of Object Code
[Figure: parse tree & symbol table → intermediate code generator → intermediate code → code optimizer → optimized intermediate code → code generator → target program]
Intermediate Code Generation
• A compiler may produce explicit intermediate code representing the source program.
• This intermediate code is generally machine (architecture) independent, but its level is close to the level of machine code.
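One widely used intermediate form is three-address code, where each instruction has at most one operator and temporaries hold partial results. The sketch below assumes that form and a tuple-based syntax tree; the generate function and the temporary names t1, t2, ... are illustrative, not the course's notation.

```python
import itertools

def generate(tree):
    """Return three-address instructions for a tree of the form ("=", target, expr)."""
    code = []
    temps = (f"t{n}" for n in itertools.count(1))

    def emit_expr(node):
        # Leaves (identifiers and constants) need no instruction.
        if not isinstance(node, tuple):
            return str(node)
        op, left, right = node
        left_name = emit_expr(left)
        right_name = emit_expr(right)
        temp = next(temps)
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, expr = tree
    code.append(f"{target} = {emit_expr(expr)}")
    return code

ir = generate(("=", "newval", ("+", "oldval", 12)))
print("\n".join(ir))
```

Nothing in the output names a register or an opcode of a particular machine, which is what keeps this level architecture independent.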
Optimizer
• The IR code generated by the semantic routines is analyzed and transformed into functionally equivalent but improved IR code.
• This phase can be very complex and slow.
• Techniques include peephole optimization, loop optimization, register allocation, and code scheduling.
Code Generator
• Produces the target language for a specific architecture.
• The target program is normally a relocatable object file containing the machine code.
The Structure of a Compiler
[Figure: Scanner (lexical analyzer) → tokens → Parser (syntax analyzer) → parse tree → Semantic Process (semantic analyzer) → abstract syntax tree with attributes → Code Generator (intermediate code generator) → non-optimized intermediate code → Code Optimizer → optimized intermediate code → Code Generator → target machine code]

More Related Content

Similar to Compilers.pptx

Introduction to computer programming
Introduction to computer programming Introduction to computer programming
Introduction to computer programming VanessaBuensalida
 
Embedded c c++ programming fundamentals master
Embedded c c++ programming fundamentals masterEmbedded c c++ programming fundamentals master
Embedded c c++ programming fundamentals masterHossam Hassan
 
introduction computer programming languages
introduction computer programming languages introduction computer programming languages
introduction computer programming languages BakhatAli3
 
Week 08_Basics of Compiler Construction.pdf
Week 08_Basics of Compiler Construction.pdfWeek 08_Basics of Compiler Construction.pdf
Week 08_Basics of Compiler Construction.pdfAnonymousQ3EMYoWNS
 
Introduction to Compilers
Introduction to CompilersIntroduction to Compilers
Introduction to CompilersAkhil Kaushik
 
4_5802928814682016556.pptx
4_5802928814682016556.pptx4_5802928814682016556.pptx
4_5802928814682016556.pptxAshenafiGirma5
 
Cd ch1 - introduction
Cd   ch1 - introductionCd   ch1 - introduction
Cd ch1 - introductionmengistu23
 
CD - CH1 - Introduction to compiler design.pptx
CD - CH1 - Introduction to compiler design.pptxCD - CH1 - Introduction to compiler design.pptx
CD - CH1 - Introduction to compiler design.pptxZiyadMohammed17
 
Lecture 01 introduction to compiler
Lecture 01 introduction to compilerLecture 01 introduction to compiler
Lecture 01 introduction to compilerIffat Anjum
 
Introduct To C Language Programming
Introduct To C Language ProgrammingIntroduct To C Language Programming
Introduct To C Language Programmingyarkhosh
 
Programming Paradigm & Languages
Programming Paradigm & LanguagesProgramming Paradigm & Languages
Programming Paradigm & LanguagesGaditek
 
Programming Paradigm & Languages
Programming Paradigm & LanguagesProgramming Paradigm & Languages
Programming Paradigm & LanguagesGaditek
 
ProgFund_Lecture_1_Introduction_to_Programming.pdf
ProgFund_Lecture_1_Introduction_to_Programming.pdfProgFund_Lecture_1_Introduction_to_Programming.pdf
ProgFund_Lecture_1_Introduction_to_Programming.pdflailoesakhan
 
C Programming Lecture 1 - Introduction to C.pptx
C Programming Lecture 1 - Introduction to C.pptxC Programming Lecture 1 - Introduction to C.pptx
C Programming Lecture 1 - Introduction to C.pptxMurali M
 
C++ programming languages lectures
C++ programming languages lectures C++ programming languages lectures
C++ programming languages lectures jabirMemon
 

Similar to Compilers.pptx (20)

Introduction to computer programming
Introduction to computer programming Introduction to computer programming
Introduction to computer programming
 
programming.pptx
programming.pptxprogramming.pptx
programming.pptx
 
Embedded c c++ programming fundamentals master
Embedded c c++ programming fundamentals masterEmbedded c c++ programming fundamentals master
Embedded c c++ programming fundamentals master
 
introduction computer programming languages
introduction computer programming languages introduction computer programming languages
introduction computer programming languages
 
Week 08_Basics of Compiler Construction.pdf
Week 08_Basics of Compiler Construction.pdfWeek 08_Basics of Compiler Construction.pdf
Week 08_Basics of Compiler Construction.pdf
 
Introduction to Compilers
Introduction to CompilersIntroduction to Compilers
Introduction to Compilers
 
Plc part 1
Plc part 1Plc part 1
Plc part 1
 
4_5802928814682016556.pptx
4_5802928814682016556.pptx4_5802928814682016556.pptx
4_5802928814682016556.pptx
 
Cd ch1 - introduction
Cd   ch1 - introductionCd   ch1 - introduction
Cd ch1 - introduction
 
CD - CH1 - Introduction to compiler design.pptx
CD - CH1 - Introduction to compiler design.pptxCD - CH1 - Introduction to compiler design.pptx
CD - CH1 - Introduction to compiler design.pptx
 
Lecture 01 introduction to compiler
Lecture 01 introduction to compilerLecture 01 introduction to compiler
Lecture 01 introduction to compiler
 
Introduct To C Language Programming
Introduct To C Language ProgrammingIntroduct To C Language Programming
Introduct To C Language Programming
 
Ic lecture8
Ic lecture8 Ic lecture8
Ic lecture8
 
Programming Paradigm & Languages
Programming Paradigm & LanguagesProgramming Paradigm & Languages
Programming Paradigm & Languages
 
Programming Paradigm & Languages
Programming Paradigm & LanguagesProgramming Paradigm & Languages
Programming Paradigm & Languages
 
Chapter 4 computer language
Chapter 4 computer languageChapter 4 computer language
Chapter 4 computer language
 
ProgFund_Lecture_1_Introduction_to_Programming.pdf
ProgFund_Lecture_1_Introduction_to_Programming.pdfProgFund_Lecture_1_Introduction_to_Programming.pdf
ProgFund_Lecture_1_Introduction_to_Programming.pdf
 
C Programming Lecture 1 - Introduction to C.pptx
C Programming Lecture 1 - Introduction to C.pptxC Programming Lecture 1 - Introduction to C.pptx
C Programming Lecture 1 - Introduction to C.pptx
 
C++ programming languages lectures
C++ programming languages lectures C++ programming languages lectures
C++ programming languages lectures
 
Presentation-1.pptx
Presentation-1.pptxPresentation-1.pptx
Presentation-1.pptx
 

Recently uploaded

Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfPower Karaoke
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfGOING AOT WITH GRAALVM – DEVOXX GREECE.pdf
GOING AOT WITH GRAALVM – DEVOXX GREECE.pdfAlina Yurenko
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...
Try MyIntelliAccount Cloud Accounting Software As A Service Solution Risk Fre...MyIntelliSource, Inc.
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 

Recently uploaded (20)

Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series

Compilers.pptx

  • 1. Compilers Design and Construction Lecture 1 Chapter 1
  • 2. Course Aims • Any program written in a programming language must be translated before it can be executed. • This translation is typically accomplished by a software system called a compiler. • This course aims to introduce the principles and techniques used to perform this translation and the issues that arise in the construction of a compiler.
  • 3. Our course talks about:
  • 4. Plan
      Week | Subject | Reading
      1 | Introduction | Ch 1
      2 | Lexical analysis | Ch 2-3
      3, 4 | Syntax analysis | Ch 4-5
      5 | Syntax, semantic analysis | Ch 4-5, Ch 6
      6 | Semantic analysis | Ch 6
      7 | Mid-term exam |
      8 | Intermediate code | Ch 7-8
      9, 10, 11 | Control flow, code generation | Ch 9-10
      12 | Code optimization | Ch 10
  • 5. Assessments
      Topic | Mark
      Lab | 20%
      Mid Term Exam | 15%
      Presence | 5%
      Final Term Exam | 60%
  • 6. Learning Outcomes: • A student successfully completing this course should be able to: • understand the principles governing all phases of the compilation process. • understand the role of each of the basic components of a standard compiler. • show awareness of the problems of, and the methods and techniques applied to, each phase of the compilation process. • apply standard techniques to solve basic problems that arise in compiler construction. • understand how the compiler can take advantage of particular processor characteristics to generate good code.
  • 7. References • Class textbook • Compilers: Principles, Techniques, and Tools by Aho, Sethi, and Ullman • Other useful books • Advanced Compiler Design & Implementation, Steven Muchnick • Building an Optimizing Compiler, Robert Morgan • Modern Compiler Implementation in Java, Andrew Appel
  • 8. Software Categories • System SW • Programs written for computer systems • Compilers, operating systems, … • Application SW • Programs written for computer users • Word-processors, spreadsheets, & other application packages
  • 9. A Layered View of a Computer from the perspective of compiler Machine with all its hardware System Software Compilers, Interpreters, Preprocessors, etc. Operating System, Device Drivers Application Programs Word-Processors, Spreadsheets, Database Software, IDEs, etc…
  • 10. Programs • Any program can be written in any programming language • A programming language (PL) is • A set of rules and symbols used to construct a computer program • A language used to interact with the computer
  • 11. Why study Compilation Technology? • Success stories (one of the earliest branches in CS) • Applying theory to practice (scanning, parsing, static analysis) • Ideas from different parts of computer science are involved: • AI: heuristic search techniques; greedy algorithms • Algorithms: graph algorithms • Theory: pattern matching • Also: systems, architecture • Compiler construction can be challenging and fun: • New architectures always create new challenges • Success requires mastery of complex interactions • Results are useful • Opportunity to achieve performance
  • 12. CS Expert Programmer Simple User Manually Problem that needs to be solved automatically Make SW to solve specific problem Make SW to compile any program
  • 13. Principles of Compilation • The compiler must: • preserve the meaning of the program being compiled. • “improve” the source code in some way. • Space (size of compiled code) • Feedback (information provided to the user) • Debugging • Compilation time efficiency (fast or slow compiler?)
  • 15. Compilers • “Compilation” • Translation of a program written in a source language into a semantically equivalent program written in a target language Compiler Error messages Source Program Target Program Input Output Target program : an executable machine-language program.
  • 16. Interpreters • “Interpretation” • Performing the operations implied by the source program Interpreter Source Program Input Output Error messages
  • 17. History • IBM developed the 704 in 1954. All programming was done in assembly language. The cost of software development far exceeded the cost of hardware. Low productivity. • Speedcoding interpreter: programs ran about 10 times slower than hand-written assembly code • John Backus (in 1954): Proposed a program that translated high-level expressions into native machine code. Skepticism all around. Most people thought it was impossible • Fortran I project (1954-1957): The first compiler was released
  • 18. Fortran I • The first compiler had a huge impact on the programming languages and computer science. The whole new field of compiler design was started. • More than half the programmers were using Fortran by 1958. • The development time was cut down to half. • Modern compilers preserve the basic structure of the Fortran I compiler !!!
  • 19. Computer Languages • Machine Language • Uses binary code • Machine-dependent • Not portable • Assembly Language • Uses mnemonics (short words that are easy to remember) • Machine-dependent • Not usually portable • High-Level Language (HLL) • Uses English-like language • Portable (but must be compiled for different platforms) • Examples: Pascal, C, C++, Java, Fortran, ...
  • 20. Machine Language • The representation of a computer program which is actually read and understood by the computer. • A program in machine code consists of a sequence of machine instructions. • Instructions: • Machine instructions are in binary code • Instructions specify operations and the memory cells involved in the operation • Example:
      Operation | Address
      0010 | 0000 0000 0100
      0100 | 0000 0000 0101
      0011 | 0000 0000 0110
  • 21. Assembly Language A symbolic representation of the machine language of a specific processor. Is converted to machine code by an assembler. Each line of assembly code produces one machine instruction (One-to-one correspondence). Programming in assembly language is slow and error-prone but is more efficient in terms of hardware performance. Mnemonic representation of the instructions and data Example: Load Price Add Tax Store Cost
  • 22. High-level language • A programming language that uses statements consisting of English-like keywords such as "FOR", "PRINT" or "IF", etc. • Each statement corresponds to several machine language instructions (one-to-many correspondence). • Much easier to program in than assembly language. • Operations can be described using familiar symbols • Example: Cost = Price + Tax
  • 24. Editors, Preprocessors, Linkers & Loaders • Editors • Compilers have been bundled together with editors and other programs into an interactive development environment (IDE) • May include some operations of a compiler, such as reporting some errors • Preprocessors • Delete comments, include other files, and perform macro substitutions • Linkers • Collect separate object files into a directly executable file • Connect an object program to the code for standard library functions and to resources supplied by the OS • Loaders • Resolve all relocatable addresses relative to a given base • Make executable code more flexible
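
To make the one-to-many correspondence concrete, here is a minimal sketch of a toy accumulator machine that runs the three assembly instructions from the earlier slide to compute `Cost = Price + Tax`. The `run` function, the single accumulator, and the sample values 100 and 8 are assumptions made for illustration, not a real instruction set.

```python
def run(program, memory):
    """Execute Load/Add/Store instructions against named memory cells."""
    acc = 0  # the single accumulator register of this toy machine
    for line in program:
        op, operand = line.split()
        if op == "Load":
            acc = memory[operand]       # copy a memory cell into the accumulator
        elif op == "Add":
            acc += memory[operand]      # add a memory cell to the accumulator
        elif op == "Store":
            memory[operand] = acc       # copy the accumulator back to memory
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# One high-level statement, Cost = Price + Tax, becomes three instructions.
assembly = ["Load Price", "Add Tax", "Store Cost"]
memory = run(assembly, {"Price": 100, "Tax": 8, "Cost": 0})
print(memory["Cost"])  # 108
```

A compiler for the high-level statement would have to emit something like the `assembly` list; the programmer only writes the one line.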
  • 25. Compiling and running C programs Editor Compiler Linker Source code file.c Object code file.obj Executable code file.exe Libraries
  • 26. Debuggers • Used to find execution errors in a compiled program • Keep track of most or all of the source code information • Stop execution at pre-specified locations called breakpoints
  • 27. Debugging program errors Editor Compiler Linker Source code file.c Object code file.obj Executable code file.exe Libraries Syntactic Errors Semantic Errors
  • 28. Interpreters • Execute the source program immediately rather than generating object code • Examples: BASIC, LISP, used often in educational or development situations • Speed of execution is slower than compiled code • Share many of their operations with compilers
  • 29. How to translate? • Direct translation is difficult. Why? • • Source code and machine code mismatch in level of abstraction • – Variables vs Memory locations/registers • – Functions vs jump/return • – Parameter passing • – structs • • Some languages are farther from machine code than others • – For example, languages supporting Object Oriented Paradigm
  • 30. How to translate easily? • Translate in steps. Each step handles a reasonably simple, logical, and well defined task • • Design a series of program representations • • Intermediate representations should be amenable to program manipulation of various kinds (type checking, optimization, code generation etc.) • • Representations become more machine specific and less language specific as the translation proceeds
  • 31. The first few steps • The first few steps can be understood by analogies to how humans comprehend a natural language • • The first step is recognizing/knowing the alphabet of a language. For example • – English text consists of lower and upper case letters, digits, punctuation and white space • – Written programs consist of characters from the ASCII character set (normally codes 9-13 and 32-126)
  • 32. The first few steps • The next step to understand the sentence is recognizing words • –How to recognize English words? • –Words found in standard dictionaries • –Dictionaries are updated regularly
  • 33. The first few steps • How to recognize words in a programming language? • – a dictionary (of keywords etc.) • – rules for constructing words (identifiers, numbers etc.) • • This is called lexical analysis • • Recognizing words is not completely trivial. • For example: w hat ist his se nte nce?
  • 34. Lexical Analysis: Challenges • • We must know what the word separators are • • The language must define rules for breaking a sentence into a sequence of words. • • Normally white spaces and punctuations are word separators in languages.
  • 35. Lexical Analysis: Challenges • • In programming languages a character from a different class may also be treated as word separator. • • The lexical analyzer breaks a sentence into a sequence of words or tokens: • – If a == b then a = 1 ; else a = 2 ; • – Sequence of words (total 14 words) • if a == b then a = 1 ; else a = 2 ;
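
The 14-word split above can be reproduced with a few lines of code. This is a minimal sketch using a regular expression; the pattern `TOKEN_RE` and the function name `tokenize` are illustrative choices, and only the token shapes that occur in the slide's sentence are covered.

```python
import re

# Longest alternatives first, so "==" is tried before "=".
TOKEN_RE = re.compile(r"\s*(==|[A-Za-z_]\w*|\d+|[=;])")


def tokenize(text):
    """Break a sentence into a list of words (tokens)."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"cannot tokenize input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


tokens = tokenize("if a == b then a = 1 ; else a = 2 ;")
print(len(tokens), tokens)  # 14 tokens
```

Because an operator character belongs to a different class than letters, `tokenize("a==b")` still yields three tokens even though there is no white space between them.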
  • 36. The next step • • Once the words are understood, the next step is to understand the structure of the sentence • • The process is known as syntax checking or parsing
  • 37. Parsing Parsing a program is exactly the same process as shown in previous slide. • Consider an expression if x == y then z = 1 else z = 2
  • 38. Understanding the meaning • • Once the sentence structure is understood we try to understand the meaning of the sentence (semantic analysis) • • A challenging task • • Example: Prateek said Nitin left his assignment at home • • What does his refer to? Prateek or Nitin?
  • 39. Understanding the meaning • • Worse case Amit said Amit left his assignment at home • • Even worse Amit said Amit left Amit’s assignment at home • • How many Amits are there? Which one left the assignment? Whose assignment got left?
  • 40. Semantic Analysis • • Too hard for compilers. • They do not have capabilities similar to human understanding • • However, compilers do perform analysis to understand the meaning and catch inconsistencies • • Programming languages define strict rules to avoid such ambiguities • { int Amit = 3; { int Amit = 4; cout << Amit; } }
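
The nested-block example can be modelled as a chain of scopes, where a lookup starts in the innermost block and walks outward. This is a minimal sketch; the `Scope` class and its method names are hypothetical and not taken from any particular compiler.

```python
class Scope:
    """One block of the program, linked to its enclosing block."""

    def __init__(self, parent=None):
        self.names = {}
        self.parent = parent

    def declare(self, name, value):
        self.names[name] = value

    def lookup(self, name):
        # Search the innermost scope first, then each enclosing scope.
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        raise NameError(f"undeclared identifier: {name}")


outer = Scope()
outer.declare("Amit", 3)          # { int Amit = 3;
inner = Scope(parent=outer)
inner.declare("Amit", 4)          #   { int Amit = 4;

print(inner.lookup("Amit"))       # 4: the inner declaration hides the outer one
print(outer.lookup("Amit"))       # 3
```

The rule "innermost declaration wins" is what removes the ambiguity about which Amit is meant.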
  • 41. More on Semantic Analysis • • Compilers perform many other checks besides variable bindings • • Type checking Amit left her work at home • • There is a type mismatch between her and Amit. Presumably Amit is a male. And they are not the same person.
  • 42. Code Optimization • • No strong counterpart in English, but it is similar to editing/précis writing • • Automatically modify programs so that they • – Run faster • – Use fewer resources (memory, registers, space, fewer fetches etc.)
  • 43. Code Optimization • • Some common optimizations • –Common sub-expression elimination • –Copy propagation • –Dead code elimination • –Code motion • –Strength reduction • –Constant folding • • Example: x = 15 * 3 is transformed to x = 45
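
Constant folding is the easiest of these to demonstrate. The following is a minimal sketch that folds constants in Python source using the standard `ast` module (Python 3.9 or later is assumed because of `ast.unparse`); the class name `ConstantFolder` and the helper `fold` are illustrative, and only `+`, `-` and `*` on numeric literals are handled.

```python
import ast
import operator

# Operators this sketch knows how to evaluate at compile time.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the operands first (bottom-up)
        op = _OPS.get(type(node.op))
        if (
            op is not None
            and isinstance(node.left, ast.Constant)
            and isinstance(node.right, ast.Constant)
            and isinstance(node.left.value, (int, float))
            and isinstance(node.right.value, (int, float))
        ):
            folded = op(node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value=folded), node)
        return node


def fold(source):
    """Return the source text with constant sub-expressions evaluated."""
    tree = ConstantFolder().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(fold("x = 15 * 3"))  # x = 45
```

Expressions that involve a variable are left alone except for their constant parts, so `fold("y = a + 2 * 3")` gives `y = a + 6`.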
  • 45. Compilers • Analysis of the source program. • Synthesis into a machine-language program.
  • 46. Parts of Compilers 1. Lexical Analysis 2. Syntax Analysis 3. Semantic Analysis 4. Code Generation 5. Optimization Analysis Synthesis Front End Back End
  • 47. Compilers • Analysis • Front End • Split the source code into its constituent pieces (tokens). • Arrange the pieces according to grammatical rules (parse). • Report errors. • Synthesis • Back End • Produce intermediate code • Optimize intermediate code • Generate target code (machine language code)
  • 48. Structure of a Compiler • Front end: analysis • Read source program and understand its structure and meaning • Back end: synthesis • Generate equivalent target language program • Source → Front End → Back End → Target
  • 49. Phases of a Compiler • Source Program → Lexical Analyzer → Syntax Analyzer → Semantic Analyzer → Intermediate Code Generator → Code Optimizer → Code Generator → Target Program • Symbol Table Manager • Error Handler
  • 50. The Structure of a Compiler • Source Program (Character Stream) → Scanner → Tokens → Parser → Syntactic Structure → Semantic Routines → Intermediate Representation → Optimizer → Code Generator → Target machine code • Symbol and Attribute Tables (used by all phases of the compiler) • Analysis phase • Synthesis phase
  • 51. by Neng-Fa Zhou Analysis source program lexical analyzer syntax analyzer semantic analyzer source program tokens parse trees parse trees
  • 52. The Structure of a Compiler: Scanner (Lexical Analyzer) • The scanner begins the analysis of the source program by reading the input, character by character, and grouping characters into individual words and symbols (tokens) • Puts information about identifiers into the symbol table. • Regular expressions are used to describe tokens (lexical constructs). • A (deterministic) finite state automaton can be used in the implementation of a lexical analyzer.
  • 53. Scanner (Lexical Analyzer) • Ex: newval = oldval + 12 • Tokens:
      newval | identifier
      = | assignment operator
      oldval | identifier
      + | add operator
      12 | a number
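
A scanner for this example can be written as a small table-driven deterministic finite automaton. The sketch below is an assumption-laden illustration: the state names, the `TRANSITIONS` table and the `scan` function are invented here, and only identifiers, integers, `=` and `+` are recognized.

```python
def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    return ch


# (current state, character class) -> next state
TRANSITIONS = {
    ("start", "letter"): "ident",
    ("ident", "letter"): "ident",
    ("ident", "digit"): "ident",
    ("start", "digit"): "number",
    ("number", "digit"): "number",
    ("start", "="): "assign",
    ("start", "+"): "add",
}

# Accepting states and the token kind each one reports.
ACCEPTING = {
    "ident": "identifier",
    "number": "number",
    "assign": "assignment operator",
    "add": "add operator",
}


def scan(text):
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        state, start = "start", i
        # Follow transitions as long as possible (longest match).
        while i < len(text) and (state, char_class(text[i])) in TRANSITIONS:
            state = TRANSITIONS[(state, char_class(text[i]))]
            i += 1
        if state not in ACCEPTING:
            raise SyntaxError(f"unexpected character {text[i]!r} at position {i}")
        tokens.append((text[start:i], ACCEPTING[state]))
    return tokens


for lexeme, kind in scan("newval = oldval + 12"):
    print(lexeme, kind)
```

Scanner generators build a table like `TRANSITIONS` automatically from regular expressions; here it is written by hand to keep the automaton visible.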
  • 54. The Structure of a Compiler: Parser (Syntax Analyzer) • Given a formal syntax specification (typically a context-free grammar [CFG]), the parser reads tokens and groups them into units as specified by the productions of the CFG being used. • As syntactic structure is recognized, the parser either calls corresponding semantic routines directly or builds a syntax tree. • CFG (Context-Free Grammar) • BNF (Backus-Naur Form) • GAA (Grammar Analysis Algorithms)
  • 55. Parser (Syntax Analyzer) • A syntax analyzer creates the syntactic structure (generally a parse tree) of the given program. • A syntax analyzer is also called a parser. • A parse tree describes a syntactic structure.
  • 56. Parser (Syntax Analyzer): CFG • The syntax of a language is specified by a context-free grammar (CFG). • The rules in a CFG are mostly recursive. • A syntax analyzer checks whether a given program satisfies the rules implied by a CFG or not. • If it does, the syntax analyzer creates a parse tree for the given program. • Ex: We use BNF (Backus-Naur Form) to specify a CFG:
      assgstmt -> identifier := expression
      expression -> identifier
      expression -> number
      expression -> expression + expression
  • 57. Syntax Analyzer versus Lexical Analyzer • Which constructs of a program should be recognized by the lexical analyzer, and which ones by the syntax analyzer? • Both of them do similar things, but the lexical analyzer deals with simple non-recursive constructs of the language. • The syntax analyzer deals with recursive constructs of the language. • The lexical analyzer simplifies the job of the syntax analyzer. • The lexical analyzer recognizes the smallest meaningful units (tokens) in a source program. • The syntax analyzer works on the smallest meaningful units (tokens) in a source program to recognize meaningful structures in our programming language.
  • 58. The Structure of a Compiler: Semantic Routines • Perform two functions: • Check the static semantics of each construct • Do the actual translation • The heart of a compiler • Result is: syntax-directed translation • Semantic processing techniques • Ex: newval = oldval + 12 • The type of the identifier newval must match the type of the expression (oldval + 12)
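
A small recursive-descent parser shows how such a grammar turns a token list into a tree. This is a sketch under stated assumptions: the rule `expression -> expression + expression` is left-recursive and ambiguous as written, so the code below reads it as "an operand followed by any number of `+ operand`", grouped left to right. The function names and the tuple-based tree shape are illustrative.

```python
def parse_assignment(tokens):
    """assgstmt -> identifier := expression"""
    if len(tokens) < 3 or not tokens[0].isidentifier() or tokens[1] != ":=":
        raise SyntaxError("expected: identifier := expression")
    tree, pos = parse_expression(tokens, 2)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return ("assgstmt", tokens[0], tree)


def parse_expression(tokens, pos):
    """expression -> operand { + operand }, grouped left to right."""
    tree, pos = parse_operand(tokens, pos)
    while pos < len(tokens) and tokens[pos] == "+":
        right, pos = parse_operand(tokens, pos + 1)
        tree = ("+", tree, right)
    return tree, pos


def parse_operand(tokens, pos):
    """operand -> identifier | number"""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok.isdigit():
        return ("number", int(tok)), pos + 1
    if tok.isidentifier():
        return ("identifier", tok), pos + 1
    raise SyntaxError(f"unexpected token {tok!r}")


# Tokens are obtained by splitting on white space to keep the sketch short.
tree = parse_assignment("newval := oldval + 12".split())
print(tree)
```

A program that does not satisfy the grammar, such as `newval := +`, is rejected with a `SyntaxError` instead of producing a tree.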
  • 60. Symbol Table • There is a record for each identifier • The attributes include name, type, location, etc.
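
The symbol table and the type check from the semantic-routines slide fit together naturally. The sketch below is illustrative only: the `Symbol` record, the `SymbolTable` class, the fixed 4-byte size, and the tuple-shaped expression tree are all assumptions, and numeric literals are assumed to have type `int`.

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    name: str
    type: str
    location: int  # hypothetical offset of the variable in memory


class SymbolTable:
    def __init__(self):
        self._records = {}
        self._next_location = 0

    def declare(self, name, type_, size=4):
        if name in self._records:
            raise NameError(f"redeclared identifier: {name}")
        self._records[name] = Symbol(name, type_, self._next_location)
        self._next_location += size
        return self._records[name]

    def lookup(self, name):
        if name not in self._records:
            raise NameError(f"undeclared identifier: {name}")
        return self._records[name]


def type_of(expr, table):
    """Compute the type of an expression tree, or raise TypeError."""
    kind = expr[0]
    if kind == "number":
        return "int"
    if kind == "identifier":
        return table.lookup(expr[1]).type
    if kind == "+":
        left, right = type_of(expr[1], table), type_of(expr[2], table)
        if left != right:
            raise TypeError(f"cannot add {left} and {right}")
        return left
    raise ValueError(f"unknown expression kind: {kind}")


def check_assignment(target, expr, table):
    """The type of the target must match the type of the expression."""
    target_type = table.lookup(target).type
    expr_type = type_of(expr, table)
    if target_type != expr_type:
        raise TypeError(f"cannot assign {expr_type} to {target_type} {target}")
    return target_type


table = SymbolTable()
table.declare("newval", "int")
table.declare("oldval", "int")
expr = ("+", ("identifier", "oldval"), ("number", 12))
print(check_assignment("newval", expr, table))  # int
```

If `oldval` had been declared with a different type, the same call would raise a `TypeError`, which is the kind of inconsistency semantic analysis is meant to catch.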
  • 61. Synthesis of Object Code intermediate code generator code optimizer code generator parse tree & symbol table intermediate code optimized intermediate code target program
  • 62. The Structure of a Compiler: Intermediate Code Generation • A compiler may produce explicit intermediate code representing the source program. • This intermediate code is generally machine (architecture) independent, but its level is close to the level of machine code.
  • 64. The Structure of a Compiler: Optimizer • The IR code generated by the semantic routines is analyzed and transformed into functionally equivalent but improved IR code • This phase can be very complex and slow • Peephole optimization, loop optimization, register allocation, code scheduling
  • 66. The Structure of a Compiler: Code Generator • Produces the target language for a specific architecture. • The target program is normally a relocatable object file containing the machine code.
  • 68. The Structure of a Compiler Scanner [Lexical Analyzer] Parser [Syntax Analyzer] Semantic Process [Semantic analyzer] Code Generator [Intermediate Code Generator] Code Optimizer Tokens Parse tree Abstract Syntax Tree w/ Attributes Non-optimized Intermediate Code Optimized Intermediate Code Code Optimizer Target machine code
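
One common form of intermediate code is three-address code, in which each instruction has at most one operator. The sketch below generates it from a tuple-shaped expression tree; the tree shape, the temporary names `t1`, `t2`, ... and the function `gen_tac` are assumptions made for this illustration.

```python
import itertools


def gen_tac(target, expr):
    """Translate `target = expr` into a list of three-address instructions."""
    code = []
    temps = itertools.count(1)  # t1, t2, ...

    def emit(node):
        kind = node[0]
        if kind in ("identifier", "number"):
            return str(node[1])          # leaves need no instruction
        if kind == "+":
            left = emit(node[1])
            right = emit(node[2])
            temp = f"t{next(temps)}"
            code.append(f"{temp} = {left} + {right}")
            return temp
        raise ValueError(f"unknown expression kind: {kind}")

    result = emit(expr)
    code.append(f"{target} = {result}")
    return code


tac = gen_tac("newval", ("+", ("identifier", "oldval"), ("number", 12)))
print("\n".join(tac))
# t1 = oldval + 12
# newval = t1
```

Nothing in the output names a register or a specific instruction set, which is what keeps this level independent of the target architecture.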
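
Peephole optimization looks at one instruction (or a short window of instructions) at a time and replaces it with something cheaper. The sketch below works on textual three-address instructions of the form `dest = a op b`; the instruction format and the function name `peephole` are assumptions, and only three rewrites are shown.

```python
def peephole(code):
    """Apply simple local rewrites to a list of three-address instructions."""
    optimized = []
    for instr in code:
        dest, expr = [part.strip() for part in instr.split("=", 1)]
        parts = expr.split()
        if len(parts) == 3:
            left, op, right = parts
            if op == "+" and right == "0":
                expr = left              # x + 0  ->  x
            elif op == "*" and right == "1":
                expr = left              # x * 1  ->  x
        if expr == dest:
            continue                     # x = x has no effect, drop it
        optimized.append(f"{dest} = {expr}")
    return optimized


before = ["t1 = a + 0", "b = t1 * 1", "b = b", "c = b + t1"]
after = peephole(before)
print(after)  # ['t1 = a', 'b = t1', 'c = b + t1']
```

Each rewrite preserves the meaning of the program, which is the first principle a compiler must respect when it improves code.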