CS256 - Principles of Structured Design

Fall 2021

Introduction to C Programming:

Reading:

Chapter 1: Intro
Chapter 2: Overview
Chapter 3: Setup
- You can ignore the part about installation
Chapter 4: Program Structure
Chapter 5: Basic Syntax
Chapter 6: Data Types

How to program:

The computer is like a train, and the tracks the train runs on are the program the computer follows. You lay down the tracks to get the train to where you want it to go.

Programming strategies:

Top-down:
Breaks down a large problem into smaller and smaller "chunks", usually into functions and sub-functions, until discrete solvable problems at the lowest layer are resolved. This is like outlining a story.
Bottom-up:
Starts with basic building blocks (think Lego) and builds discrete units, resolving how they work together as higher-level structures emerge from the combined lower-level pieces.
Both approaches have upsides and downsides; the best strategy is usually to combine them.
Defining and designing the data structures and patterns first often leads to the program design itself, so start with the data. The data is often the most important aspect of a program.

Structured programming:

A programming paradigm that makes extensive use of subroutines, block structures, for and while loops.
Emerged in the 1960s, following the introduction of the ALGOL programming language and Edsger Dijkstra's 1968 letter "Go To Statement Considered Harmful".
Gave rise to:
- Procedural programming, which gave rise to:
  - Object oriented programming

Example of unstructured vs structured:

Unstructured (BASIC):

    10 I=0
    20 PRINT "Hello, World!"
    30 I=I+1
    40 IF I<10 THEN GOTO 20

Structured (C):

    for(i=0; i < 10; i=i+1) {
      printf("Hello, world!");
    }

There is some difference in the order of operations, but otherwise they do the same thing: print 'Hello, world!' 10 times. In the BASIC example the flow of the code is controlled using the GOTO statement and there is no apparent "structure" to the code, whereas in the C example the print statement is a child of the for statement, giving the program a more tree-like structure.

Elements of structured programming:

Reading: Chapter 4: Program Structure
Sequence: ordered statements executed in sequence.
Selection: Statements executed dependent on state of the program.
(if - else, switch/case)
Iteration: Statements executed until the program reaches a certain state. (while, do-while, for)
- Loops should have only one entry point.
- Originally only one exit point, but few languages enforce this.
Subroutines:
- Callable units such as procedures, functions, methods or sub-programs.
- Allow a sequence to be referred to by a single statement.
Blocks:
- Allow groups of statements to be treated as if they were one statement.
- {...} in C.
- if ... fi (ALGOL/Bourne shells), begin ... end (Pascal/others)
- white-space (Python)
Early exit:
- A way of altering the flow of execution usually with respect to edge cases.
- break - Exits loop or switch immediately.
- continue - Moves to the next iteration of a loop.
- return - Exits subroutine immediately.
- exit - Exits program immediately.

Flow-charts:

Flowcharts are a visual representation of program logic (see the Wikipedia entry on flowcharts). They describe what the grammar means in terms of what the computer will do.
Common Symbols: The most common symbols, which you should be aware of, are:
- Flowline
- Terminal
- Process
- Decision
- Input/Output

Computer grammars:

Reading: Chapter 5: Basic Syntax

The grammar of a language is the set of rules and structure of a language, and like natural languages (such as English), computer languages also have grammar that needs to be followed.

It is very important to understand the grammar of a computer language. It is the means by which you "communicate" your intentions to the computer, so if you do not understand or remember the grammar, you cannot communicate with the computer and your programs will not compile. (Compiling is the phase where the computer attempts to interpret your intentions and transform them into instructions it will follow.) Thus you cannot learn to program without first learning the grammar and structure of a programming language.

Fortunately, computer grammars tend to be what is called "formal", i.e. they are "well defined": they reduce ambiguity in the meaning of the words and symbols of the grammar as much as possible. They tend to have quite simple and strict rules that, once learned, can usually be applied quickly to other programming languages that use similar grammars. Computer languages also tend to have very few words and symbols compared with natural languages (a few hundred, compared to tens of thousands), and you should already be familiar with a number of them from having learned English and Math.

Like natural languages, computer grammars are "recursive", i.e. the language is defined like a set of fill-in-the-blank templates (like Mad Libs), such as:

      if ( __________ ) _________
           expression   statement

The above is itself a "statement" (specifically an "if-statement"), which may be inserted in its entirety where the statement is specified, such that

      if (a) s

is valid grammar where a (and b and c below) is an "expression", and 's' is a statement of some sort. The following is also valid grammar:

      if (a) if (b) s

where (if (b) s) is a complete if-statement that replaces the s in the first (if (a) s) example. Furthermore:

      if (a) if (b) if (c) s

is also valid, and so forth. If we draw {}'s around each "statement" the last example would look like:

      {
        if (a) {
          if (b) {
            if (c) { s }
          }
        }
      }

which is perfectly valid grammar in a C like language.

Some C grammars:

exp == an expression that evaluates to true or false (i.e. Boolean)
stmt == A statement

`type var,var,...; type var=exp;`	Variable declaration
`exp;`	An expression statement
`var = exp;`	An assignment expression statement
`{ stmt1; ...}`	Block of statements, equivalent to just one statement
`if (exp) stmt`	Do `stmt` if `exp` is true
`if (exp) stmt1 else stmt2`	Do `stmt1` if `exp` is true, otherwise do `stmt2`
`while (exp) stmt`	Do `stmt` over and over as long as `exp` remains true
`do { stmt } while (exp);`	Do `stmt` repeatedly as long as `exp` is true; always does `stmt` at least once.
`for(init-exp;test-exp;inc-exp) stmt`	More or less equivalent to: `init-exp; while(test-exp) { stmt inc-exp; }`
`break;`	Exit a loop immediately.
`continue;`	Continue to the next loop iteration immediately.
`return-type name (variable-declarations) { stmt1; ... }`	A function/procedure
`return exp;`	Gives a function its value.

Functions / procedures / sub-routines:

More reading: Chapter 13: Functions

A function is an operation that may take some input data, process it, and output a result based on that input. An example of a function is addition.

Example Notation

(a + b) Infix notation (operator is "in"-between the operands)

(+ a b) Prefix notation (Polish Notation) (LISP, C functions.) Operator comes before the operands.

(a b +) Postfix notation (Reverse Polish Notation (RPN).) Operator follows the operands.

The + function requires two (or more in the case of prefix or postfix notation) inputs, which are added together to produce the result.

In expressions, the inputs are also recursive, for example in the above a or b can be complete sub-expressions (i.e. a could = 5, or a = (x+2), etc.)

Prefix to infix example:
(+ (- a b) c) is the same as (a - b) + c

which is equivalent to:

add( subtract(a, b), c)

In C, function calls are prefix expressions where the operands, enclosed in parentheses and separated by commas, follow the function name. The operands to a function can be any expression, which may include function calls themselves.

Procedures / sub-routines: A function that requires no input and returns no value may also be created, usually to combine code that needs to be called repeatedly or from different areas of the program. This gives the code a name and allows you to stop thinking about the code inside the procedure and just think of its function, making the program easier to manage mentally and reducing the amount of code you have to write.

If you do something in your program more than twice, it is usually a good idea to consider making a function or procedure to do that thing.

In C, the return keyword gives a function its value, and also causes the computer to leave the function immediately.

    int add(int a, int b) {
     │       └──────┴───────── Data-type for the inputs, a and b.
     └──────────────────────── Data-type for the returned value for the function.
      return a + b;
             ╘═══╛──────────── The value of this function that is returned.
    }

For now, we won't be using functions too much, but you will always require at least one function for a C program: the "main" function, which is the starting point of all C programs. When you leave the main function (via return), the program ends. The return type for main is always an integer (int), so when defining main it should look like:

    int main(void)
    {
      /* Your code here */

      return 0;
    }

The number being returned by main is used by the shell to determine if the program was successful or not. Usually we don't care, but here 0 means the program was successful, and any other number from 1 to 255 means it failed in some way (the number usually indicating which way it failed.)

The void parameter to main means main takes no parameters. Alternatively, main can be written like:

    int main(int argc, char *argv[])
    ...

which would allow our main function to get command line parameters from the user which would be stored in the argv[] array.

Data-types:

Reading: Chapter 6: C Data Types

What is data? In a computer the most fundamental unit of data is the "bit" which represents one of two possible states (off/0 or on/1). By combining bits together into various sizes (bytes (8 bits), words (16 bits), long words (32 bits) or quad words(64 bits)) the computer can represent integers of various sizes.

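Typical sizes on a 64-bit Linux system (the C standard only guarantees minimum sizes):

Integer size:   C type:   Unsigned range:       Signed range:
1 bit           _Bool     0 - 1                 n/a (always unsigned)
8 bits          char      0 - 255               -128 - 127
16 bits         short     0 - 65535             -32768 - 32767
32 bits         int       0 - ~4 billion        +/- ~2 billion
64 bits         long      0 - ~18 quintillion   +/- ~9 quintillion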

By default C integers are signed (i.e. they can hold negative values). Plain char is the exception: whether it is signed depends on the compiler. To make a type unsigned, prefix it with the "unsigned" modifier. For example:

unsigned int
unsigned char

Negative numbers, floating point values, strings, arrays or other kinds of structured data arise from how the computer interprets the grouping of bits.

ASCII characters, for example, are 7-bit numeric values (stored in an 8-bit byte). Use 'man ascii' from the command line to view the ASCII table.

Value:      Characters:
0 - 31      "control characters" (invisible, usually non-printable characters)
10          '\n' (newline)
32          ' ' (space)
48 - 57     '0' - '9' (numeric digit characters)
65 - 90     'A' - 'Z' (upper-case letters)
97 - 122    'a' - 'z' (lower-case letters)

In C, the ASCII value of a character can be had by enclosing the character in single quotes, i.e. 'A' is 65, 'B' is 66, etc. In general it is considered better to use 'A' instead of 65 when comparing letters as 65 is considered a "magic number", i.e. it isn't clear what 65 means. 'A' is more meaningful to someone reading your code than 65 is.

A string is a sequence (array) of ASCII characters (encoded into 8 bit bytes) enclosed inside of double quotes. Strings are terminated by a null byte (numeric 0 or '\0'), though you don't see the null character in your string. Thus:

"Hello" is the sequence of ASCII characters:

Character:      'H'  'e'  'l'  'l'  'o'  '\0'
Numeric value:   72  101  108  108  111    0

The null character at the end of the string is very important. Without it, code that processes the string (such as the standard library's string functions) cannot know how long the string is, and thus when to stop processing it.

Floating point values (allow fractional values, like 1.5, 2.6e135)

C type:        Size:            Range:         Precision:
float          32 bits          ~ 10^+/-38     6 digits
double         64 bits          ~ 10^+/-308    15 digits
long double    80 bits (x86)    ~ 10^+/-4932   19 digits

Floating point values are not always exact (for example 1/3 or .1 are just approximations) and they are always signed, unlike integers, which may be unsigned.

Any number written with a decimal point in it is a floating point value, thus: .5, 0.5, 10., .100 are all floating point values.

Floats can be specified in scientific (exponent) notation, for example: 5.2e-20, which is the same as saying: 5.2 x 10^-20.