Algorithms and Data Structures - Getting Started

This page is part of Programming and CS - Getting Started

Note - this page will be filled out during fall 2020 as the CS 202 and 458 courses run.

Reading

  • TutorialsPoint - start with this; it covers the basic data structures and algorithms (material typically covered in CS 201 Programming Structures, CS 202 Data Structures and Algorithms, and CS 500 Programming Fundamentals at ISU).
  • Introduction to Algorithms - or start with this. This includes both basic data structures and algorithms and further material that is typically covered in CS 458 Algorithms and CS 420 Theory of Computation at ISU.
  • Algorithm Design - these are slides to accompany the Algorithm Design textbook. This textbook leans more towards material that is typically covered in CS 458 Algorithms and CS 658 Algorithms II at ISU, and is a fairly modern treatment of algorithms.

Data Structures Basics

For each of the following basic data structures, you should have memorized the algorithms for insert, lookup, and delete; the worst-case, average-case, and best-case running times (including why each running time is what it is); and the typical uses of each data structure. You should also be able to use each data structure in your programs.

Note - this material is typically covered in CS 201 Programming Structures, CS 202 Data Structures and Algorithms, and CS 500 Programming Fundamentals at ISU.

  • Array (unsorted) - O(1) insert, O(n) lookup using linear search, O(1) delete if you have an index to what is being deleted (swap the last element into its place).
  • Array (sorted) - O(n) insert, O(log(n)) lookup using binary search, O(n) delete because need to shift to keep in sorted order.
  • Linked list - O(1) insert, O(n) lookup using linear search, O(1) delete given a pointer to the node (in a singly linked list, a pointer to the node's predecessor).
  • Arrays / linked lists - used when data is small enough that O(n) running time is acceptable. Code is very simple.
  • Stack - stored using a linked list or an array; one use is keeping track of local variables in a running program, another is arithmetic using reverse Polish notation (RPN).
  • Queue - stored using a linked list; used when there is a "line" waiting for some resource where the first in the line gets the resource first.
  • Binary search tree (not necessarily balanced) - operations are O(height of tree), which can be O(n) in the worst case of a completely unbalanced tree but is typically O(log(n)) on random data. The code for the operations is very simple (see the sketch after this list). BSTs are used when the data is large enough that an array / linked list is too slow, and are preferred over a hash table if you might need to consider a range of values (e.g., all people whose last name starts with "Joh").
  • Binary search tree (balanced) - operations are O(height of tree), tree height is kept O(log(n)) by rebalancing/shifting the tree if it becomes too unbalanced.
  • Hash table (deal with collisions using chaining/buckets) - operations are O(size of the linked-list bucket), which is O(1) if the size of the hash table is chosen appropriately and the data is random, but O(n) in the worst case (all items collide into one bucket). Hash tables are used when lookup/insert/delete need to be fast and keeping the data in sorted order does not matter.
  • Hash table (deal with collisions using probing) - performance similar to hash table with chaining/buckets but could be worse as the hash table fills up.
  • Heap - operations are O(log(n)); used when you need fast access to the largest or smallest item (e.g., a priority queue).
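
As a concrete illustration of how short unbalanced BST code can be, here is a minimal sketch in Python (the names and structure are illustrative only, not course-provided code):

    class Node:
        def __init__(self, key):
            self.key = key
            self.left = None
            self.right = None

    def insert(root, key):
        # Walk down from the root and attach a new leaf; O(height of tree).
        if root is None:
            return Node(key)
        if key < root.key:
            root.left = insert(root.left, key)
        elif key > root.key:
            root.right = insert(root.right, key)
        return root

    def lookup(root, key):
        # Follow left/right pointers, as in binary search; O(height of tree).
        while root is not None and root.key != key:
            root = root.left if key < root.key else root.right
        return root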

Asymptotic Analysis (Big O)

Note - this material is typically covered in CS 201 Programming Structures, CS 202 Data Structures and Algorithms, and CS 500 Programming Fundamentals at ISU.

You should have memorized the two main definitions of big O, and the definitions of little o, big Omega, little omega, and big Theta. You should have memorized the following rules of thumb, and be able to use and prove them.

  • constant factors - in any formula, constant multiples can be removed (e.g., 3*n^2 is Theta(n^2)).
  • polynomials - for any formula that has only polynomial terms, the formula is O(the term with the largest exponent) (e.g., 5*n^3 + n is O(n^3)).
  • logarithms < polynomials < exponentials - any fixed power of a logarithm is O(any fixed positive power of n), e.g., (log(n))^10 is O(n^0.01). Note that in practice you would likely prefer the latter, but in terms of big O the former is smaller eventually, for large enough n. Similarly, any fixed power of n is O(any exponential with a base greater than 1), e.g., n^100 is O(1.01^n).

You should be able to take a list of formulas and put them into "big O order" using the above rules of thumb.
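
For example (an illustrative exercise, not taken from the reading above), the list

    (log(n))^10,  100*sqrt(n),  n*log(n),  n^2 + 3*n,  2^n

is already in big O order from smallest to largest: a fixed power of a logarithm is eventually smaller than any fixed positive power of n (such as sqrt(n) = n^0.5), the constant 100 can be dropped and n^0.5 is O(n*log(n)), n*log(n) is O(n^2) because log(n) is O(n), the lower-order term 3*n does not change the order, and any polynomial is O(2^n).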

Algorithms Basics

Note - this material is typically covered in CS 201 Programming Structures, CS 202 Data Structures and Algorithms, and CS 500 Programming Fundamentals at ISU.

Searching

  • Linear Search: Worst Case O(n). Used to search an unsorted array or a linked list.
  • Binary Search: Worst Case O(log n). Used to search a sorted array (see the sketch below).
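
A minimal binary search sketch in Python (illustrative only; assumes the array is already sorted and returns the index of target, or -1 if it is not present):

    def binary_search(a, target):
        # Each iteration halves the search range, so the worst case is O(log n).
        lo, hi = 0, len(a) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if a[mid] == target:
                return mid
            elif a[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        return -1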

Sorting

  • Bubble Sort: Worst Case O(n^2). Simple comparison sort that repeatedly swaps adjacent out-of-order elements; mainly used for teaching and very small inputs.
  • Merge Sort: Worst Case O(n log n). Divide-and-conquer sort that recursively sorts the two halves of the array and then merges them (see the sketch below).
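
A possible merge sort sketch in Python (illustrative only, not course-provided code):

    def merge_sort(a):
        # Divide: recursively sort each half; total running time is O(n*log(n)).
        if len(a) <= 1:
            return a
        mid = len(a) // 2
        left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
        # Conquer: merge the two sorted halves in linear time.
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        return merged + left[i:] + right[j:]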

Graph Algorithms

  • Breadth First Search (BFS): Explores a graph level by level from a start vertex using a queue; O(V + E). Finds shortest paths by edge count in an unweighted graph (see the sketch after this list).
  • Depth First Search (DFS): Explores a graph by going as deep as possible along each path before backtracking, using recursion or an explicit stack; O(V + E). Used for cycle detection, topological sorting, and connectivity.
  • Prim's Minimum Spanning Tree Algorithm: Greedy algorithm that grows a minimum spanning tree from a start vertex by repeatedly adding the cheapest edge leaving the tree; O(E log V) with a binary heap.
  • Kruskal's Minimum Spanning Forest Algorithm: Greedy algorithm that considers edges in increasing order of weight and adds an edge whenever it does not create a cycle (tracked with a union-find structure); O(E log E).
  • Dijkstra's Shortest Path Algorithm: Computes shortest paths from a source vertex when all edge weights are non-negative by repeatedly settling the closest unsettled vertex; O((V + E) log V) with a binary heap.
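
A minimal breadth first search sketch in Python, assuming (for illustration) that the graph is given as a dictionary mapping each vertex to a list of its neighbors:

    from collections import deque

    def bfs(graph, start):
        # Visit vertices level by level using a queue; O(V + E) overall.
        visited = {start}
        order = []
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph.get(v, []):
                if w not in visited:
                    visited.add(w)
                    queue.append(w)
        return order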

NP Completeness

Algorithm technique - dynamic programming

Algorithm technique - greedy

Algorithm technique - divide and conquer

Algorithm technique - linear programming