CS256 - Principles of Structured Design
Fall 2021
Introduction to C Programming:
Reading:
How to program:
The computer is like a train, and the tracks the train runs on are the program
the computer follows. You lay down the tracks to get the train to where
you want it to go.
Programming strategies:
-
Top-down:
Breaks down a large problem into smaller and smaller "chunks", usually into
functions and sub-functions, until discrete solvable problems at the lowest
layer are resolved. This is like outlining a story.
-
Bottom-up:
Starts with basic building blocks (think Lego) and builds discrete units,
resolving the way in which they work together as higher orders emerge from
the combined lower basis.
-
Both have upsides and downsides; the best strategy is usually to combine the
two methods.
-
Defining and designing the data structures and patterns first often leads
to the program design itself, so start with the data. The data is often the
most important aspect of a program.
Structured programming:
-
A programming paradigm that makes extensive use of subroutines, block
structures, and for and while loops.
-
Emerged in the 1960s after Dijkstra's "Go To Statement Considered Harmful"
paper and the introduction of the ALGOL programming language.
-
Gave rise to:
Example of unstructured vs structured:
Unstructured (BASIC):
10 I=0
20 PRINT "Hello, World!"
30 I=I+1
40 IF I<10 THEN GOTO 20
Structured (C):
for(i=0; i < 10; i=i+1) {
    printf("Hello, world!");
}
There is some difference in the order of operations (the BASIC example tests
the condition after printing, the C example tests it before), but otherwise
they do the same thing: print "Hello, world!" 10 times. In the BASIC example
the flow of the code is controlled using the GOTO statement and there is no
apparent "structure" to the code, whereas in the C example the print statement
is a child of the for statement, giving the program a more tree-like structure.
Elements of structured programming:
-
Reading: Chapter 4: Program Structure
-
Sequence: ordered statements executed in sequence.
-
Selection: Statements executed dependent on state of the program.
(if-else, switch/case)
-
Iteration: Statements executed until the program reaches a certain
state. (while, do-while, for)
- Loops should have only one entry point.
- Originally only one exit point, but few languages enforce this.
-
Subroutines:
- Callable units such as procedures, functions, methods or sub-programs.
- Allow a sequence to be referred to by a single statement.
-
Blocks:
- Allow groups of statements to be treated as if they were one statement.
{ ... } in C.
if ... fi (ALGOL/Bourne shells), begin ... end (Pascal/others)
- white-space (Python)
-
Early exit:
- A way of altering the flow of execution usually with respect to edge cases.
break - Exits loop or switch immediately.
continue - Moves to the next iteration of a loop.
return - Exits subroutine immediately.
exit - Exits program immediately.
Flow-charts:
-
Flowcharts are a visual representation of program logic.
Wikipedia entry on flowcharts
They describe what the grammar means in terms of what the computer will do.
-
Common Symbols:
The most common symbols, and the ones you should be aware of, are:
- Flowline
- Terminal
- Process
- Decision
- Input/Output
Computer grammars:
The grammar of a language is the set of rules and structure of a language,
and like natural languages (such as English), computer languages also have
grammar that needs to be followed.
It is very important to understand the grammar of a computer language. It is
the means by which you "communicate" your intentions to the computer, so
if you do not understand or remember the grammar, you cannot communicate with
the computer. Your programs will not compile (compiling is the phase where the
computer attempts to interpret your intentions and transform them into
instructions it will follow), so you cannot learn to program without first
learning the grammar and structure of a programming language.
Fortunately, computer grammars tend to be what is called "formal", i.e. they
are "well defined": they reduce ambiguity in the meaning of the words and
symbols as much as possible, and they tend to have quite simple and strict
rules that, once learned, can usually be applied quickly to other programming
languages that use similar grammars. Computer languages also tend to have very
few words and symbols compared with natural languages (a few hundred, compared
to tens of thousands), and you should already be familiar with a number of
them from having learned English and Math.
Like natural languages, computer grammars are "recursive", i.e. the language
is defined like a set of fill-in-the-blank Mad Libs, such as:
if ( __________ ) _________
     expression   statement
The above is itself a "statement" (specifically an "if-statement"), which
may be inserted in its entirety where the statement is specified, such that
if (a) s
is valid grammar where a (and b and c below) is an "expression", and 's'
is a statement of some sort. The following is also valid grammar:
if (a) if (b) s
where (if (b) s) is a complete if-statement that replaces the s in the
first (if (a) s) example. Furthermore:
if (a) if (b) if (c) s
is also valid, and so forth. If we draw {}'s around each "statement" the last example
would look like:
{
if (a) {
if (b) {
if (c) { s }
}
}
}
which is perfectly valid grammar in a C-like language.
Some C grammars:
exp == an expression that evaluates to true or false (i.e. Boolean)
stmt == A statement
Grammar | Meaning
---|---
`type var,var,...;` `type var=exp;` | Variable declaration
`exp;` | An expression statement
`var = exp;` | An assignment expression statement
`{ stmt1; ...}` | Block of statements, equivalent to just one statement
`if (exp) stmt` | Do `stmt` if `exp` is true
`if (exp) stmt1 else stmt2` | Do `stmt1` if `exp` is true, otherwise do `stmt2`
`while (exp) stmt` | Do `stmt` over and over as long as `exp` remains true
`do { stmt } while (exp);` | Do `stmt` repeatedly as long as `exp` is true; always does `stmt` at least once.
`for(init-exp;test-exp;inc-exp) stmt` | More or less equivalent to: `init-exp; while(test-exp) { stmt; inc-exp; }`
`break;` | Exit a loop immediately.
`continue;` | Continue to the next loop iteration immediately.
`return-type name (variable-declarations) { stmt1; ... }` | A function/procedure
`return exp;` | Gives a function its value.
Functions / procedures / sub-routines:
A function is some operation that may take some input data, process it, and
output some result based on that input. An example of a function is
addition.
Example | Notation
---|---
(a + b) | Infix notation: the operator is "in" between the operands.
(+ a b) | Prefix notation (Polish Notation): the operator comes before the operands. Used by LISP and C functions.
(a b +) | Postfix notation (Reverse Polish Notation, RPN): the operator follows the operands.
The + function requires two (or more in the case of prefix or postfix notation)
inputs, which are operated on to produce the result of adding them together.
In expressions, the inputs are also recursive: in the above, a or b can be
complete sub-expressions (e.g. a could be 5, or a could be (x+2), etc.)
Prefix to infix example:
(+ (- a b) c) is the same as (a - b) + c
which is equivalent to:
add( subtract(a, b), c)
In C, function calls are prefix expressions where the operands, enclosed in
parentheses and separated by commas, follow the function name. The operands
to a function can be any expression, which may include function calls
themselves.
Procedures / sub-routines: A function that does not require any input, or
need to return a value, may also be created, usually to combine code that
needs to be called repeatedly or from different areas of the program. This
gives a name to the code and allows you to stop thinking about the code inside
of the procedure and just think of what it does, making the program easier to
mentally manage as well as reducing the amount of coding necessary in general.
If you do something in your program more than twice, it is usually a good
idea to consider making a function or procedure to do that thing.
In C, the return keyword gives a function its value, and causes the computer
to leave the function immediately as well.
int add(int a, int b) {
 │      └──────┴───────── Data-type for the inputs, a and b.
 └──────────────────────── Data-type for the returned value for the function.
    return a + b;
           ╘═══╛──────────── The value of this function that is returned.
}
For now, we won't be using functions too much, but you will always require
at least one function for a C program, and that is the "main" function,
which is the starting point of all C programs. When you leave the
main function (via return), the program will complete. The return type for
main is always an integer (int), so when defining main it should look like:
int main(void)
{
    /* Your code here */
    return 0;
}
The number being returned by main is used by the shell to determine if the
program was successful or not. Usually we don't care, but here 0 means the
program was successful, and any other number from 1 to 255 means it failed
in some way (the number usually indicating which way it failed).
The void parameter to main means we don't need any parameters to
main. Alternatively, main can be written like:
int main(int argc, char *argv[])
...
which allows our main function to get command line parameters from the user.
These are stored in the argv[] array, and argc holds the number of entries.
Data-types:
What is data? In a computer the most fundamental unit of data is the "bit"
which represents one of two possible states (off/0 or on/1). By combining
bits together into various sizes (bytes (8 bits), words (16 bits), long
words (32 bits) or quad words(64 bits)) the computer can represent integers
of various sizes.
Integer size: | C type: | unsigned range: | signed range:
---|---|---|---
1 bit | _Bool | 0-1 | 0-1
8 bits | char | 0-255 | -128 - 127
16 bits | short | 0-65535 | -32768 - 32767
32 bits | int | 0-~4 billion | +/- ~2 billion
64 bits | long | 0-~18 quintillion | +/- ~9 quintillion
These sizes are typical of 64-bit Linux systems; the C standard itself only
guarantees minimum sizes for each type.
By default C integers are signed (i.e. can be negative values.) To make them
unsigned, you prefix the type with the "unsigned" modifier. ex:
unsigned int
unsigned char
Negative numbers, floating point values, strings, arrays or other kinds of
structured data arise from how the computer interprets the grouping of bits.
ASCII characters, for example, are 7-bit numeric values (stored in an 8-bit
byte). Use 'man ascii' from the command line to view the ASCII table.
Value: | Characters:
---|---
0 - 31 | "Control characters": invisible, usually non-printable characters
10 | '\n' (newline)
32 | ' ' (space)
48 - 57 | '0' - '9' (Numeric digit characters)
65 - 90 | 'A' - 'Z' (Upper-case letters)
97 - 122 | 'a' - 'z' (Lower-case letters)
In C, the ASCII value of a character can be had by enclosing the character in
single quotes, i.e. 'A' is 65, 'B' is 66, etc. In general it is considered
better to use 'A' instead of 65 when comparing letters as 65 is considered
a "magic number", i.e. it isn't clear what 65 means. 'A' is more meaningful
to someone reading your code than 65 is.
A string is a sequence (array) of ASCII characters (encoded into 8 bit bytes)
enclosed inside of double quotes. Strings are terminated by a null byte
(numeric 0 or '\0' ), though you don't see the null character in your string.
Thus:
"Hello" is the sequence of ASCII characters:
Character: | 'H' | 'e' | 'l' | 'l' | 'o' | '\0'
---|---|---|---|---|---|---
Numeric value: | 72 | 101 | 108 | 108 | 111 | 0
- The null character at the end of the string is very important. Without it,
the functions that process the string cannot know how long it is, and thus
when to stop.
Floating point values (allow fractional values, like 1.5, 2.6e135)
C-type: | Size: | Range: | Precision:
---|---|---|---
float | 32 bits | ~ 10^+/-38 | 6 digits
double | 64 bits | ~ 10^+/-308 | 15 digits
long double | 80 bits | ~ 10^+/-4932 | 19 digits
Floating point values are not always exact (for example 1/3 or .1 are just
approximations) and they are always signed; unlike integers, there are no
unsigned floating point types.
Any number with a decimal place in it is a floating point value, thus:
.5 , 0.5 , 10. , .100 are all floating point values.
Floats can be specified in scientific (exponent) notation, for example:
5.2e-20 which is the same as saying: 5.2 x 10^-20.