Difference between revisions of "Using Linux - Large Text Files"

Revision as of 15:01, 13 August 2019

This page is a part of the Linux and CS Systems Bootcamp. It assumes you have your computer set up to connect to the CS server, or have the appropriate software installed on your computer to run commands. Go back to the Linux and CS Systems Bootcamp main page if you don't have our system set up yet.

Large Text File

On this page we walk you through looking at a text file that contains the complete works of Shakespeare (courtesy of Project Gutenberg). The file was downloaded from http://www.gutenberg.org/files/100/. How many lines and words are there in this file? The file is a bit large to open with Word (you can try, but it takes a while to load). Instead of opening the file in Word, we can use some Linux commands to get information about it. Check back at the Linux and CS Systems Bootcamp page for more commands that might be useful.

The sample session here shows how you can copy the file into your account on the CS server and check how many lines and words are in the file. You can also watch a [video demo]. If you would like to follow along and run these commands, first log in to the system and open up the terminal.

cs299@cs:~> cd ~
cs299@cs:~> mkdir shakespeare
cs299@cs:~> cd shakespeare
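
If you run the setup steps a second time, mkdir reports an error because the directory already exists. A minimal sketch of a repeatable version follows; the function name make_workdir is a hypothetical helper chosen for illustration, not something provided on the CS server.

```shell
# make_workdir is a hypothetical helper name used only for this example.
# It creates the directory given as its first argument (if needed) and moves into it.
make_workdir() {
    # -p tells mkdir not to complain if the directory already exists.
    # && runs cd only when mkdir succeeded.
    mkdir -p "$1" && cd "$1"
}
```

Running make_workdir ~/shakespeare should leave you in the same place as the three commands above, whether or not the directory was already there.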

Copy the text file from where it is stored on the CS server, then use the wc command to see how many lines, words, and characters (bytes) are in the file. The three numbers wc prints appear in that order.

cs299@cs:~/shakespeare> cp /u1/junk/shakespeare.txt .
cs299@cs:~/shakespeare> ls
shakespeare.txt
cs299@cs:~/shakespeare> wc shakespeare.txt 
 124787  904061 5589890 shakespeare.txt
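
If you only want one of the three numbers, wc accepts the options -l (lines), -w (words), and -c (bytes). The sketch below wraps them in a small function; the name count_file is a hypothetical helper introduced for illustration and is not part of the original session.

```shell
# count_file is a hypothetical helper name used only for this example.
# It prints the line, word, and byte counts of the file given as its first argument.
count_file() {
    file=$1
    # Reading the file through < makes wc print only the number, not the filename.
    lines=$(wc -l < "$file")
    words=$(wc -w < "$file")
    bytes=$(wc -c < "$file")
    # $(( )) strips any padding spaces that some versions of wc add.
    echo "lines=$((lines)) words=$((words)) bytes=$((bytes))"
}
```

Calling count_file shakespeare.txt should report the same three counts that wc printed in the session above, each with a label.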

You can also use the nano text editor (or whatever text editor you like) to look through the text file.

cs299@cs:~/shakespeare> nano shakespeare.txt
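
If you just want a quick look at the file without opening an editor, the standard head and tail commands print the first and last lines of a file. The following is a small sketch, assuming a hypothetical helper name peek that is not part of the original page.

```shell
# peek is a hypothetical helper name used only for this example.
# It prints the first and last N lines of a file (N defaults to 5).
peek() {
    file=$1
    n=${2:-5}
    head -n "$n" "$file"   # first N lines
    echo "..."             # marker for the skipped middle of the file
    tail -n "$n" "$file"   # last N lines
}
```

For example, peek shakespeare.txt 10 would show the first ten and last ten lines of the file, which is often enough to check that the copy worked.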