Python Programming - Getting Started and Using Linux - Large Text Files: Difference between pages

From Computer Science at Indiana State University
(Difference between pages)
m 1 revision imported
 
''This page is part of [[Programming and CS - Getting Started]]''
''This page is a part of the [[Linux and CS Systems - Getting Started]].  This page assumes you have your computer set up to connect to the CS server, or have the appropriate software installed on your computer to run commands.  Go back to the Linux and CS Systems Getting Started main page if you don't have our system set up yet.''


For a video explaining how to get started here, see https://www.youtube.com/watch?v=uLnhcCZS4-Y&t=424s
=Large Text File=
On this page we walk you through looking at a text file that contains the complete works of Shakespeare (courtesy of [https://www.gutenberg.org Project Gutenberg]).  The file was downloaded from [http://www.gutenberg.org/files/100/].  How many lines and words are there in this file? The file is a bit large to open with Word (you can try, it takes a while for it to actually load).  So instead of opening the file in Word, we can use some of our Linux commands to get some information about the file.  Check back at the Linux and CS Systems Getting Started page for more commands that might be useful.


==Getting Started==
The sample session here shows how you can copy the file into your account on the CS server and check how many lines and words are in the file.  If you would like to follow along and run these commands, first log in to the system and open up the terminal.  
# '''Reading''' - start reading through at least one of the following before you start working on the programming problems.
## [https://www.w3schools.com/python/python_intro.asp w3schools Python] - interactive tutorial where you can try out code in the browser
## [https://www.learnpython.org/ Learn Python] - interactive tutorial where you can try out code in the browser
## [https://automatetheboringstuff.com/ Automate the Boring Stuff with Python] - suitable for people with very limited programming experience, this is the text that is being followed for our CS 151 course (as of 2019)
## [http://greenteapress.com/thinkpython2/html/index.html Think Python] - suitable for people with very limited programming experience, not very deep
## [https://learnxinyminutes.com/docs/python3/ LearnXinYminutes] - quick review once you are familiar with the basics
## [https://docs.python.org/3/tutorial/ Python.org tutorial] - good for those with programming experience already
# '''Get Python installed on your computer''' - download the latest Python3 version at https://www.python.org/downloads/.  Having python3 installed on your computer allows you to debug much more quickly.
# '''Work on solving problems''' that are listed below.
# If you are a current or incoming ISU student, or an ISU alum, get help on what you are working on using the online unix lab.  When asking about the hackerrank problems, make sure to refer to them using the title hackerrank gives them, or give a link to the problem statement on hackerrank.
# '''Cheat sheets''' - keep a cheat sheet for yourself of python syntax, built-in functions, etc.  See below on this page for our model cheat sheets.


==Running Python==
First, we create a directory in our home directory for keeping the shakespeare file.
''See [https://youtu.be/ABmQ2QyYCLg this video] for a video demo of getting started running Python on the CS server and in IDLE.''
 
Once you have a computer with Python installed you are ready to try it out!  You can use the CS systems (computers in our classrooms and labs, or connect to the CS server - see [[Linux - System Setup]]), or install Python on your personal computer.  ''We recommend doing both.''
 
===Run in the Terminal===
On the CS server, other Linux systems, and macOS, you can run python using your terminal.  If you run the command <code>python3</code> then you will see the python console.  Here you can type one python line at a time to run, and the results are displayed. You can try these -
<pre>
<pre>
2 + 3
cs299@cs:~> cd ~
3 * 4
cs299@cs:~> mkdir shakespeare
10 ** 3
cs299@cs:~> cd shakespeare
</pre>
</pre>
 
Next we copy the text file from where it is stored on the CS server and use the wc command to see how many lines, words, and characters (bytes) are in the file.
We normally store our python code in text files and want to run the entire file.  If you have a file with python code (let's say it is named myProgram.py) then you change directories to the directory that has your file, and then type <code>python3 myProgram.py</code>.  You can have one terminal open where you are running your program with this command, and you can have another terminal open to edit the file (or use some other text editor).
 
===Run in IDLE===
Python comes with a graphical front-end called IDLE.  You can run this program if you are in the CS labs by opening your terminal and then typing <code>idle3</code>.  If you have python installed on your personal computer, then you can find IDLE in your list of programs (Start button on Windows, Finder then click Applications on macOS) or run idle3 from the terminal (if you are using Linux).  If you are connecting to the CS server with a terminal program from home, you cannot run idle3 in the terminal because it is a graphical program - you will need to install it on your home computer to use it from home. 
 
When you start IDLE, it shows you the Python3 console, where you can type your Python3 commands.  Try typing in arithmetic expressions to make sure it is working. For example -
<pre>
<pre>
2 + 3
cs299@cs:~/shakespeare> cp /u1/junk/shakespeare.txt .
3 * 4
cs299@cs:~/shakespeare> ls
10 ** 3
shakespeare.txt
cs299@cs:~/shakespeare> wc shakespeare.txt
124787  904061 5589890 shakespeare.txt
</pre>
</pre>
 
You can also use the nano text editor (or whatever text editor you like) to look through the text file.
===First Python Program===
We store python programs in text files.  You can use the following as your first python program.
<pre>
<pre>
print('Hello world.')
cs299@cs:~/shakespeare> nano shakespeare.txt
print('Hello again.')
print('Goodbye now.')
</pre>
</pre>
If you are using IDLE you create and run a file by -
You can use the head and tail commands to look at the start and end of a text file, and grep to look for particular lines in the file.
# Click on File and then New File. 
<pre>
# Copy/paste or type your code in the file window that comes up.
cs299@cs:~/shakespeare> head shakespeare.txt
# Click File and Save As.  Name the file myProgram.py and save it in whichever directory you want to keep your python programs. 
The Project Gutenberg EBook of The Complete Works of William Shakespeare, by
# Click Run and then Run Module.  In the python console window you should see that your program ran.
William Shakespeare
If you are connecting to the CS server with your terminal from your home computer, you can choose one of the following options.
* Create and edit the myProgram.py file with a text editor that works in the terminal - see [[Linux Terminal - Text Editors]]. Save the program and run the program with <code>python3 myProgram.py</code>. If you make changes, edit the program, and then run it again.
* Edit the program using a text editor on your home computer (for example, Kate or Atom, see [[Linux Terminal - Text Editors]]), transfer to the CS server using a file transfer program (see [[Linux - System Setup]]), change directories in your terminal to the directory that has the program, and then run the program in the terminal with <code>python3 myProgram.py</code>


===Python2 versus Python3===
This eBook is for the use of anyone anywhere at no cost and with
If you want to use Python version 2 instead, you would use commands <code>python</code> and <code>idle</code> rather than <code>python3</code> and <code>idle3</code>.  You should normally use Python3; the only reason to use Python2 is if you are using a package that is only available on Python2 (Python2 is not officially supported anymore, so this should be fairly rare).
almost no restrictions whatsoever.  You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.gutenberg.org


==Installing Python on Your Personal Computer==
** This is a COPYRIGHTED Project Gutenberg eBook, Details Below **
'''Windows and macOS''' Download and install from https://www.python.org/downloads/. 
**    Please follow the copyright guidelines in this file.     **
 
cs299@cs:~/shakespeare> head -n 1 shakespeare.txt
'''Linux''' On Linux python is probably already installed. If not, do an internet search for your Linux distribution and "install python3" (e.g., search for "ubuntu install python3").
The Project Gutenberg EBook of The Complete Works of William Shakespeare, by
 
cs299@cs:~/shakespeare> tail -n 1 shakespeare.txt
'''Chromebook''' You should first get Linux (beta) installed (see [https://cs.indstate.edu/wiki/index.php/Linux_-_System_Setup#Chromebook Linux - System Setup]) and then you can follow instructions for installing python3 for ubuntu.  If python3 is not already installed, then running <code>sudo apt install python3</code> should install it.
*** END: FULL LICENSE ***
 
cs299@cs:~/shakespeare> grep "Copyright" shakespeare.txt
==Lists of Problems==
what you can do with this work.  Copyright laws in most countries are in
===Programming Assignments 1===
cs299@cs:~/shakespeare> grep -i -m 3 "Copyright" shakespeare.txt
[[Programming Assignments - Beginning 1]] - start with trying to solve these problems. Each requires a different feature of the Python programming language, so solve these problems as you read through one of the tutorials or links above. Note that the page includes a link to repl.it that contains solutions to most of the problems.  If you do not have Python installed on your computer, you can try it out at [https://repl.it/ repl.it] - click the logo at the top, then click "+ new repl", select Python and Create Repl.
** This is a COPYRIGHTED Project Gutenberg eBook, Details Below **
 
**     Please follow the copyright guidelines in this file.     **
[[Programming Assignments - Beginning 2]] - another set of classic beginning programming exercises.  Some of these will be more involved.
*This Etext has certain copyright implications you should read!*
 
===HackerRank Problems===
Once you can do some basic Python programming it is time to have some of your work checked.  Hackerrank is a site where you can create an account and work on problems whose solutions are checked for correctness.  Note that hackerrank has very strict rules for accepting correct solutions.  Start with the basic problems to get a feel for what hackerrank expects.
 
[[Python programming - hackerrank problems]]
 
===More Practice===
If you are able to do most of the problems on the pages linked above, then you don't need us to give you lists of problems any more.  You can pick problems to work through on your own.  Some suggested places with problems are as follows.
* [https://www.hackerrank.com/domains/algorithms Hackerrank - algorithms]
* [https://www.hackerrank.com/domains/data-structures Hackerrank - data structures]
* [http://cs.indstate.edu/acm/contests.html ACM Contest Problems]
 
==Cheat Sheets and Quizzes==
These cheat sheets have the highlights of some of the basic information to memorize when you are in your first year of Python programming.  Note - each cheat sheet contains a sample quiz that might be used by your instructor.
* [[Cheat sheet - Python Operators, Expressions]]
* [[Cheat sheet - Python Keywords, Concepts, Functions]]
 
==Programming Errors and How to Fix Them==
* '''Syntax error''' - program is not a valid program, and normally python gives you an error message giving some indication of what the problem was. 
** ''Debugging'' - read the error message!  Normally you are given a line number where python got "confused" - sometimes that is where the problem is, but you may need to look just before that line (or just after).  Common syntax errors in python...
** ''Mismatched'' () or [] or "" or '', for example <code>x = 2 * (3 + 4</code>
** ''Mis-capitalization or mis-spelling'', for example <code>Print('hello')</code>
** ''Data type'' problems - mixing up strings and numbers, for example <code>x = '3' + 4</code>
** ''Improper indenting'', for example
<pre>
for i in range(0, 10):
print(i)
</pre>
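* '''Runtime error''' - program is a valid program but when you run it something goes wrong.  Debugging is similar to debugging syntax errors and logical errors.  For example, the last line below fails because <code>y</code> and <code>z</code> have not been given values...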
<pre>
x = 1
print(x)
x = y + z
</pre>
* '''Logical error''' - program runs but does not do the right thing.
** ''Print debugging'' - One debugging method is to print things (values of variables) in your program, starting at the beginning.  Run the program and make sure it is printing the values you expect.  At some point you will see a value you don't expect, and that tells you where to look in your code.  You are often 100% sure that what you have is right, yet in the end there is some problem.  So you have to ''check everything'', starting at the very beginning of the program. 
** ''Disable all'' - Another strategy is to disable almost all parts of your program (e.g., put it all in comments) until you get behavior that makes sense, and then start adding things back in (uncommenting), and eventually you'll add something in that doesn't behave the way you thought, and now you know where the problem is.  Examples...
<pre>
x = 1
y = 2
z = 3
avg = x + y + z / 3
print(avg)
</pre>
</pre>
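
The last example above is a logical error because division is carried out before addition, so <code>x + y + z / 3</code> is evaluated as <code>1 + 2 + (3 / 3)</code>. The following is a minimal sketch of how print debugging might expose the problem; the names <code>total</code>, <code>buggy_avg</code>, and <code>fixed_avg</code> are made up for this illustration and are not part of the original example.

```python
x = 1
y = 2
z = 3

# Print debugging: check an intermediate value before using it.
total = x + y + z
print('total =', total)

# Division binds tighter than addition, so this is 1 + 2 + (3 / 3).
buggy_avg = x + y + z / 3

# Parentheses force the sum to be computed first.
fixed_avg = (x + y + z) / 3

print('buggy_avg =', buggy_avg)
print('fixed_avg =', fixed_avg)
```

Seeing that <code>total</code> is what you expect while <code>buggy_avg</code> is not narrows the problem down to the line that computes the average.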

Latest revision as of 13:22, 17 August 2025

This page is a part of the Linux and CS Systems - Getting Started. This page assumes you have your computer set up to connect to the CS server, or have the appropriate software installed on your computer to run commands. Go back to the Linux and CS Systems Getting Started main page if you don't have our system set up yet.

Large Text File

On this page we walk you through looking at a text file that contains the complete works of Shakespeare (courtesy of Project Gutenberg). The file was downloaded from http://www.gutenberg.org/files/100/. How many lines and words are there in this file? The file is a bit large to open with Word (you can try, it takes a while for it to actually load). So instead of opening the file in Word, we can use some of our Linux commands to get some information about the file. Check back at the Linux and CS Systems Getting Started page for more commands that might be useful.

The sample session here shows how you can copy the file into your account on the CS server and check how many lines and words are in the file. If you would like to follow along and run these commands, first log in to the system and open up the terminal.

First, we create a directory in our home directory for keeping the shakespeare file.

cs299@cs:~> cd ~
cs299@cs:~> mkdir shakespeare
cs299@cs:~> cd shakespeare

Next we copy the text file from where it is stored on the CS server and use the wc command to see how many lines, words, and characters (bytes) are in the file.

cs299@cs:~/shakespeare> cp /u1/junk/shakespeare.txt .
cs299@cs:~/shakespeare> ls
shakespeare.txt
cs299@cs:~/shakespeare> wc shakespeare.txt 
 124787  904061 5589890 shakespeare.txt
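
If you would like to see roughly what wc is counting, the following is a minimal Python sketch. It assumes shakespeare.txt has been copied into the current directory as shown above; the function name count_lines_words_bytes is made up for this example. It counts newline characters, whitespace-separated words, and bytes, which is the usual meaning of the three wc columns, though wc's exact word-splitting rules can depend on your system settings.

```python
from pathlib import Path


def count_lines_words_bytes(path):
    """Return (lines, words, bytes) for a file, in the spirit of wc."""
    data = Path(path).read_bytes()
    lines = data.count(b"\n")   # number of newline characters
    words = len(data.split())   # runs of non-whitespace bytes
    return lines, words, len(data)


if __name__ == "__main__":
    # Assumption: the file was copied into the current directory.
    sample = Path("shakespeare.txt")
    if sample.exists():
        print(*count_lines_words_bytes(sample), sample)
```

Reading the whole file at once is fine for a file of a few megabytes like this one; for much larger files you would read it in pieces instead.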

You can also use the nano text editor (or whatever text editor you like) to look through the text file.

cs299@cs:~/shakespeare> nano shakespeare.txt

You can use the head and tail commands to look at the start and end of a text file, and grep to look for particular lines in the file.

cs299@cs:~/shakespeare> head shakespeare.txt
The Project Gutenberg EBook of The Complete Works of William Shakespeare, by
William Shakespeare

This eBook is for the use of anyone anywhere at no cost and with
almost no restrictions whatsoever.  You may copy it, give it away or
re-use it under the terms of the Project Gutenberg License included
with this eBook or online at www.gutenberg.org

** This is a COPYRIGHTED Project Gutenberg eBook, Details Below **
**     Please follow the copyright guidelines in this file.     **
cs299@cs:~/shakespeare> head -n 1 shakespeare.txt
The Project Gutenberg EBook of The Complete Works of William Shakespeare, by
cs299@cs:~/shakespeare> tail -n 1 shakespeare.txt
*** END: FULL LICENSE ***
cs299@cs:~/shakespeare> grep "Copyright" shakespeare.txt
what you can do with this work.  Copyright laws in most countries are in
cs299@cs:~/shakespeare> grep -i -m 3 "Copyright" shakespeare.txt 
** This is a COPYRIGHTED Project Gutenberg eBook, Details Below **
**     Please follow the copyright guidelines in this file.     **
*This Etext has certain copyright implications you should read!*
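
The same kinds of lookups can be sketched in Python. The functions below are illustrations only (the names head, tail, and grep_lines and their parameters are chosen for this example). Note one simplification: grep treats its pattern as a regular expression, while this sketch does a plain substring search, which behaves the same for a simple word like "Copyright".

```python
from collections import deque
from itertools import islice


def head(path, n=10):
    """Return the first n lines of the file, like head -n."""
    with open(path, encoding="utf-8", errors="replace") as f:
        return [line.rstrip("\n") for line in islice(f, n)]


def tail(path, n=10):
    """Return the last n lines of the file, like tail -n."""
    with open(path, encoding="utf-8", errors="replace") as f:
        # A deque with maxlen keeps only the most recent n lines.
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]


def grep_lines(path, text, ignore_case=False, max_count=None):
    """Return lines containing text (plain substring, not a regex).

    ignore_case plays the role of -i and max_count the role of -m.
    """
    needle = text.lower() if ignore_case else text
    matches = []
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            if max_count is not None and len(matches) >= max_count:
                break
            haystack = line.lower() if ignore_case else line
            if needle in haystack:
                matches.append(line.rstrip("\n"))
    return matches
```

For example, grep_lines("shakespeare.txt", "Copyright", ignore_case=True, max_count=3) corresponds to the last command in the session above, and head("shakespeare.txt", 1) corresponds to head -n 1.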