Python fundamentals
Data model - everything is an object
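A short sketch of the claim (my illustration, not part of the original note): built-in values, functions, and classes are all objects, each with a type and an identity.

print(isinstance(1, object))    # True: integers are objects
print(isinstance(len, object))  # True: built-in functions are objects
print(type(int))                # <class 'type'>: classes are objects too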
Lexical analysis
What is a Python program read by? A parser, which reads a sequence of tokens produced by a lexical analyser.
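A small illustrative sketch (my addition, not from the source): the standard-library ast module runs both the tokenizer and the parser on a source string and returns the resulting syntax tree.

import ast

# ast.parse tokenizes and parses the source, returning the syntax tree
# that later compilation stages work on.
tree = ast.parse("x = y + 2")
print(ast.dump(tree))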
What is a lexical analyser? A program that performs lexical analysis.
What is lexical analysis, and what’s the etymology of the term? Lexical analysis, also called lexing or tokenizing, is the process of converting a sequence of characters into a sequence of tokens (src). The root of the term is the Greek lexis, meaning word.
What is a token? A string with an identified meaning, structured as a name-value tuple. Common token names (akin to parts of speech in linguistics) are identifier, keyword, separator, operator, literal, and comment. For the expression
a = b + 2
lexical analysis would yield the sequence of tokens [(identifier, a), (operator, =), (identifier, b), (operator, +), (literal, 2)] (src).
What’s the difference between lexical and syntactic definitions? The former operates on the individual characters of the input source (during tokenizing), the latter on the stream of tokens created by lexical analysis (during parsing).
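As a rough cross-check (my own sketch, not from the source above): the standard-library tokenize module produces the same kind of token stream for this expression, though it uses Python’s own token names (NAME, OP, NUMBER) rather than the generic linguistic categories.

import io
import tokenize

# Tokenize the expression from the note with Python's own tokenizer.
source = "a = b + 2"
for tok in tokenize.generate_tokens(io.StringIO(source).readline):
    print(tokenize.tok_name[tok.type], repr(tok.string))
# Prints NAME 'a', OP '=', NAME 'b', OP '+', NUMBER '2',
# followed by a NEWLINE and an ENDMARKER token.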
By default, Python reads text as Unicode code points. Hence, encoding declarations are only needed if some other encoding is required.
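For reference, a sketch of what such an encoding declaration looks like (the comment form follows PEP 263; the variable below is just filler):

# -*- coding: latin-1 -*-
# With this comment on the first or second line of a file, the tokenizer
# decodes the source as Latin-1 instead of the default UTF-8.
greeting = "hello"
print(greeting)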