# Difference between revisions of "Algorithms and Data Structures - Getting Started"

(→Reading) |
|||

(8 intermediate revisions by 2 users not shown) | |||

Line 1: | Line 1: | ||

''This page is part of [[Programming and CS - Getting Started]]'' | ''This page is part of [[Programming and CS - Getting Started]]'' | ||

− | |||

− | |||

=Reading= | =Reading= | ||

Line 10: | Line 8: | ||

=Data Structures Basics= | =Data Structures Basics= | ||

For each of the following basic data structures, you should have memorized the algorithms for insert, lookup, delete, the worst-case / average-case / best-case running times (including why the running time is what it is), uses of each data structure, and be able to use the data structure in your programs. | For each of the following basic data structures, you should have memorized the algorithms for insert, lookup, delete, the worst-case / average-case / best-case running times (including why the running time is what it is), uses of each data structure, and be able to use the data structure in your programs. | ||

− | * Array (unsorted) - O(1) insert, O(n) lookup using linear search, O(1) delete if you have an index to what is being deleted. | + | |

− | * Array (sorted) - O(n) insert, O(log(n)) lookup using binary search, O(n) delete because need to shift to keep in sorted order. | + | Note - this material is typically covered in CS 201 Programming Structures, CS 202 Data Structures and Algorithms, and CS 500 Programming Fundamentals at ISU. |

− | * Linked list - O(1) insert, O(n) lookup using linear search, O(1) delete. | + | |

− | * Arrays / linked lists - used when data is small enough that O(n) running time is acceptable. Code is very simple. | + | * '''Array (unsorted)''' - O(1) insert, O(n) lookup using linear search, O(1) delete if you have an index to what is being deleted. |

− | * Stack - stored using a linked list or array, one use is in keeping track of local variables in a running program, another is arithmetic using RPN. | + | * '''Array (sorted)''' - O(n) insert, O(log(n)) lookup using binary search, O(n) delete because need to shift to keep in sorted order. |

− | * Queue - stored using a linked list, used when there is a "line" waiting for some resource where the first in the line gets the resource first. | + | * '''Linked list''' - O(1) insert, O(n) lookup using linear search, O(1) delete. |

− | * Binary search tree (not necessarily balanced) - operations are O(height of tree), which can be n in the worst-case of a completely unbalanced tree but is typically O(log(n)) on random data. Code for operations is very simple. BSTs used when data is large enough that array / linked list too slow, preferred over hash table if you might need to consider a range of values (e.g., all people with last name starting with Joh). | + | * '''Arrays / linked lists''' - used when data is small enough that O(n) running time is acceptable. Code is very simple. |

− | * Binary search tree (balanced) - operations are O(height of tree), tree height is kept O(log(n)) by rebalancing/shifting the tree if it becomes too unbalanced. | + | * '''Stack''' - stored using a linked list or array, one use is in keeping track of local variables in a running program, another is arithmetic using RPN. |

− | * Hash table (deal with collisions using chaining/buckets) - operations are O(size of linked list buckets), size of buckets is O(1) if size of hash table chosen appropriately and data is random, operations are n in worst case (all items collide and stay in one bucket). Hash table used when lookup/insert/delete need to be fast and having data stored in sorted order does not matter. | + | * '''Queue''' - stored using a linked list, used when there is a "line" waiting for some resource where the first in the line gets the resource first. |

− | * Hash table (deal with collisions using probing) - performance similar to hash table with chaining/buckets but could be worse as the hash table fills up. | + | * '''Binary search tree (not necessarily balanced)''' - operations are O(height of tree), which can be n in the worst-case of a completely unbalanced tree but is typically O(log(n)) on random data. Code for operations is very simple. BSTs used when data is large enough that array / linked list too slow, preferred over hash table if you might need to consider a range of values (e.g., all people with last name starting with Joh). |

− | * Heap - operations are O(log(n)), used when need to have access to the largest or smallest item. | + | * '''Binary search tree (balanced)''' - operations are O(height of tree), tree height is kept O(log(n)) by rebalancing/shifting the tree if it becomes too unbalanced. |

+ | * '''Hash table (deal with collisions using chaining/buckets)''' - operations are O(size of linked list buckets), size of buckets is O(1) if size of hash table chosen appropriately and data is random, operations are n in worst case (all items collide and stay in one bucket). Hash table used when lookup/insert/delete need to be fast and having data stored in sorted order does not matter. | ||

+ | * '''Hash table (deal with collisions using probing)''' - performance similar to hash table with chaining/buckets but could be worse as the hash table fills up. | ||

+ | * '''Heap''' - operations are O(log(n)), used when need to have access to the largest or smallest item. | ||

+ | |||

+ | =Asymptotic Analysis (Big O)= | ||

+ | Note - this material is typically covered in CS 201 Programming Structures, CS 202 Data Structures and Algorithms, and CS 500 Programming Fundamentals at ISU. | ||

+ | |||

+ | You should have memorized the two main definitions of big O, and the definitions of little o, big Omega, small omega, and big Theta. You should have memorized, can use, and can prove the following rules of thumb. | ||

+ | * '''constant factors''' - in any formula, constant multiples can be removed (e.g., 3*n<sup>2</sup> is Theta(n<sup>2</sup>)). | ||

+ | * '''polynomials''' - for any formula that has only polynomial terms, the formula is O(term with the largest exponent). | ||

+ | * '''logarithms < polynomials < exponentials''' - any formula involving fixed powers of polynomials is O(any fixed positive polynomial powers), e.g., (log(n))<sup>10</sup> is O(n<sup>0.01</sup>). Note that in practice you might likely prefer the latter, but in terms of big O the former is smaller ''eventually, for large enough n''. Similarly, any fixed power of a polynomial is O(any fixed positive power of an exponential). | ||

+ | |||

+ | You should be able to take a list of formulas and put them into "big O order" using the above rules of thumb. | ||

+ | |||

+ | =Algorithms Basics= | ||

+ | Note - this material is typically covered in CS 201 Programming Structures, CS 202 Data Structures and Algorithms, and CS 500 Programming Fundamentals at ISU. | ||

+ | |||

+ | ==Searching== | ||

+ | * '''Linear Search''' - Worst Case O(n). Used to search an unsorted array and linked list. | ||

+ | * '''Binary Search''' - Worst Case O(log n). Used to search a sorted array. | ||

+ | |||

+ | ==Sorting== | ||

+ | Sorting algorithms can be stable or unstable. A stable sorting algorithm will keep the original order of elements that are similar. An unstable algorithm does not guarantee that similar elements will keep their original order. | ||

+ | * '''Bubble Sort''' - Worst Case O(n^2). Simple sorting algorithm which 'bubbles' the larger/smaller numbers to the end of the array. Bubble sort is a stable sorting algorithm. | ||

+ | * '''Merge Sort''' - Worst Case O(n*log(n). Sorting algorithm which recursively splits the array in half, then reorders each section of the array as it puts the pieces back together. Merge sort is a stable sorting algorithm. | ||

+ | |||

+ | =Graph Algorithms= | ||

+ | * Breath First Search (BFS): (placeholder) | ||

+ | * Depth First Search (DFS): (placeholder) | ||

+ | * Prim's Minimum Spanning Tree Algorithm: (placeholder) | ||

+ | * Kruskal's Minimum Spanning Forest Algorithm: (placeholder) | ||

+ | * Dijkstra's Shortest Path Algorithm: (placeholder) | ||

+ | |||

+ | =NP Completeness= | ||

+ | |||

+ | =Algorithm technique - dynamic programming= | ||

+ | |||

+ | =Algorithm technique - greedy= | ||

+ | |||

+ | =Algorithm technique - divide and conquer= | ||

+ | |||

+ | =Algorithm technique - linear programming= |

## Latest revision as of 15:22, 17 May 2021

*This page is part of Programming and CS - Getting Started*

## Contents

# Reading

- TutorialsPoint - start with this, it covers the basic data structures and algorithms (material typically covered in CS 201 Programming Structures, CS 202 Data Structures and Algorithms, CS 500 Programming Fundamentals at ISU).
- Introduction to Algorithms - or start with this. This includes both basic data structures and algorithms and further material that is typically covered in CS 458 Algorithms and CS 420 Theory of Computation at ISU.
- Algorithm Design - these are slides to accompany the Algorithm Design textbook. This textbook leans more towards material that is typically covered in CS 458 Algorithms and CS 658 Algorithms II at ISU, and is a fairly modern treatment of algorithms.

# Data Structures Basics

For each of the following basic data structures, you should have memorized the algorithms for insert, lookup, delete, the worst-case / average-case / best-case running times (including why the running time is what it is), uses of each data structure, and be able to use the data structure in your programs.

**Array (unsorted)**- O(1) insert, O(n) lookup using linear search, O(1) delete if you have an index to what is being deleted.**Array (sorted)**- O(n) insert, O(log(n)) lookup using binary search, O(n) delete because need to shift to keep in sorted order.**Linked list**- O(1) insert, O(n) lookup using linear search, O(1) delete.**Arrays / linked lists**- used when data is small enough that O(n) running time is acceptable. Code is very simple.**Stack**- stored using a linked list or array, one use is in keeping track of local variables in a running program, another is arithmetic using RPN.**Queue**- stored using a linked list, used when there is a "line" waiting for some resource where the first in the line gets the resource first.**Binary search tree (not necessarily balanced)**- operations are O(height of tree), which can be n in the worst-case of a completely unbalanced tree but is typically O(log(n)) on random data. Code for operations is very simple. BSTs used when data is large enough that array / linked list too slow, preferred over hash table if you might need to consider a range of values (e.g., all people with last name starting with Joh).**Binary search tree (balanced)**- operations are O(height of tree), tree height is kept O(log(n)) by rebalancing/shifting the tree if it becomes too unbalanced.**Hash table (deal with collisions using chaining/buckets)**- operations are O(size of linked list buckets), size of buckets is O(1) if size of hash table chosen appropriately and data is random, operations are n in worst case (all items collide and stay in one bucket). Hash table used when lookup/insert/delete need to be fast and having data stored in sorted order does not matter.**Hash table (deal with collisions using probing)**- performance similar to hash table with chaining/buckets but could be worse as the hash table fills up.**Heap**- operations are O(log(n)), used when need to have access to the largest or smallest item.

# Asymptotic Analysis (Big O)

You should have memorized the two main definitions of big O, and the definitions of little o, big Omega, small omega, and big Theta. You should have memorized, can use, and can prove the following rules of thumb.

**constant factors**- in any formula, constant multiples can be removed (e.g., 3*n^{2}is Theta(n^{2})).**polynomials**- for any formula that has only polynomial terms, the formula is O(term with the largest exponent).**logarithms < polynomials < exponentials**- any formula involving fixed powers of polynomials is O(any fixed positive polynomial powers), e.g., (log(n))^{10}is O(n^{0.01}). Note that in practice you might likely prefer the latter, but in terms of big O the former is smaller*eventually, for large enough n*. Similarly, any fixed power of a polynomial is O(any fixed positive power of an exponential).

You should be able to take a list of formulas and put them into "big O order" using the above rules of thumb.

# Algorithms Basics

## Searching

**Linear Search**- Worst Case O(n). Used to search an unsorted array and linked list.**Binary Search**- Worst Case O(log n). Used to search a sorted array.

## Sorting

Sorting algorithms can be stable or unstable. A stable sorting algorithm will keep the original order of elements that are similar. An unstable algorithm does not guarantee that similar elements will keep their original order.

**Bubble Sort**- Worst Case O(n^2). Simple sorting algorithm which 'bubbles' the larger/smaller numbers to the end of the array. Bubble sort is a stable sorting algorithm.**Merge Sort**- Worst Case O(n*log(n). Sorting algorithm which recursively splits the array in half, then reorders each section of the array as it puts the pieces back together. Merge sort is a stable sorting algorithm.

# Graph Algorithms

- Breath First Search (BFS): (placeholder)
- Depth First Search (DFS): (placeholder)
- Prim's Minimum Spanning Tree Algorithm: (placeholder)
- Kruskal's Minimum Spanning Forest Algorithm: (placeholder)
- Dijkstra's Shortest Path Algorithm: (placeholder)