This lesson will help you get set up to write and run Python scripts on your local computer by:
- Installing Anaconda Python, which comes with hundreds of modules already installed
- Installing PyCharm (for writing, error-checking, debugging, and running Python all in one program)
- Configuring PyCharm to work with your version of Python
Python is the programming language of choice among Digital Humanists, perhaps only rivaled by the statistical language R. I can say that Python is the most straightforward language I have ever learned (not that programming is easy!). Python is also the cleanest, most productive, and one of the most powerful languages I have encountered. Tasks that would have taken days to program in C++ or PHP can be accomplished in mere hours. Better yet, Codecademy (and other sites) have many amazing in-browser tutorials for free. Their Python tutorials are aimed at beginners with zero programming experience and will take you through to some very advanced features of Python. Along the way, you also learn quite a bit about general concepts applicable to any programming language. Much as with human languages, learning one language makes it easier to pick up more languages in the future. Their lessons are well paced, giving plenty of exercises and review to make sure concepts hit home before moving on.
There are some challenges, however.
- Codecademy teaches Python 2.7, yet the current version is 3.5 (as of the time of this writing)
- A significant gap exists between learning concepts in-browser and actually running scripts on your computer
- Learners are unsure which program they should use to write Python
- The default Python install lacks many critical modules, which have to be installed separately
- Some modules that are ideal for DH (such as the natural language toolkit) are difficult to install
To solve these issues, we are not going to install the default Python distribution, but rather Anaconda’s distribution of Python. It is the same language, except that it ships with hundreds of extremely useful modules already installed. Scholars and students will find that this distribution, paired with PyCharm, meets more of their needs than the default install does.
For my students, the first obstacle came when they tried to shift from learning how Python worked to writing scripts for their own purposes. The first potential issue is that the Codecademy tutorials are written in Python 2.7, while the current Python version is 3.5.2 (as of the time of this writing). Scripts written for one version often fail in the other without changes, but the differences are largely superficial. The print command is a good example: Python 2 writes it as a statement, while Python 3 requires parentheses because print is a function.
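To make the print difference concrete, here is a minimal sketch written for Python 3. The Python 2 form appears only in a comment, since it will not run under version 3, and the helper name `format_greeting` is made up for this illustration.

```python
# Python 2 treated print as a statement, so this line was valid there:
#     print "Hello, World!"
# Python 3 rejects that line with a SyntaxError, because print is now an
# ordinary function and its arguments must sit inside parentheses.


def format_greeting(name):
    """Build the greeting text (hypothetical helper for this sketch)."""
    return "Hello, {}!".format(name)


# The parenthesized form below is the Python 3 way to call print.
print(format_greeting("World"))

# Because print is a function in Python 3, it also accepts keyword
# arguments such as sep, the separator placed between the items.
print("Python", "3", sep=" ")
```

When you adapt a Python 2 tutorial, adding the parentheses to each print line is often the first change to try.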
Python 2 is still supported, and a large number of DH tutorials were written with Python 2 in mind (such as The Programming Historian). Looking to the future, new and more powerful modules are increasingly being written in version 3. Tutorials on this site will be written in Python 3. If you do wish to use version 2, this tutorial still applies to you, and most other tutorials can still be completed with minor differences. Consulting the documentation on python.org can help you work it out.
Beyond this, you can program in Python’s IDLE, in Sublime Text, or even in Notepad. While I use Sublime for other languages, I prefer to write Python in what is known as an IDE (Integrated Development Environment). My favorite is PyCharm, a lightweight program that handles nearly everything you will need to do in Python. I often use PyCharm just to run scripts.
When you install Python 2 or 3 from python.org, it comes with a certain number of useful modules. But you will find that Python is far more powerful when you make use of the many open-source modules available. To use them, one first has to install ‘easy_install’ and ‘pip’. Once installed, these make it much easier to quickly and automatically install further modules on your computer. While not overly difficult, installing these utilities and then downloading a large number of modules is yet another step where students can encounter problems. Beyond that, there are a number of modules that can be useful for DH methods (such as the natural language toolkit or the lxml parser) but that often fail when you try to install them. These often require complicated workarounds, even if you are familiar with Python.[/fruitful_tab] [fruitful_tab title=”Step 1: Downloading and Installing Anaconda Python and PyCharm”]
Go to Anaconda’s download page (fig. 1), choose which operating system you want (scroll down for OSX and Linux), and then download the 64-bit Graphical Installer for Python 3.5 (it’s on the right; if you want Python 2.7, choose the one on the left). Locate your download and then run the installer (fig. 2).
On OSX I ran into a strange quirk: the installer seemed to tell me that it could not install to the default location (Install for Me Only). All I had to do was click the ‘Me Only’ choice again and it was ready to install (figs. 3 & 4).
I recommend keeping the default installation path. However, Anaconda does take up sizable space (300+ megabytes on PC, 800+ on OSX). If you install to another location, just make sure to note where.
Once that is finished, head to the PyCharm download page and download the Community Edition (on the right) (fig. 5). As before, find the download and run the installer. You can keep the default installation path or not (it’s not as important as with Anaconda). Now that it is installed, we need to link it with the Python interpreter we just installed.[/fruitful_tab] [fruitful_tab title=”Step 2: Configuring PyCharm for Acaconda”]
Now that PyCharm is ready, start it up and you should be taken to a startup screen (fig. 6). At the bottom right you will see two buttons. Click the one on the left that says ‘Configure,’ and then click ‘Settings’ on the context menu that pops up. Once the settings menu appears, click the expand arrow next to ‘Default Project.’ Underneath, you should find an item titled ‘Project Interpreter’ (figs. 7 & 8). Click that now. You should see at least one choice (the one whose path matches the location where you installed Anaconda). Click it to set the default project interpreter. Make SURE to hit the ‘Apply’ button on the bottom right to confirm your changes (fig. 9).
Once you set the interpreter, you should see a large list of items populate the space below. These are all the modules that come already installed with Anaconda. As you can see, the list is quite extensive. This means that you will likely not need to download any (or at least many) modules for quite some time, at which point you will be more familiar with Python and PyCharm. If you ever do want to install a package manually, you can head into PyCharm’s terminal panel and type “pip install INSERT_PACKAGE_NAME” (without quotes).[/fruitful_tab] [fruitful_tab title=”Step 3: Configuring Your First Project”]
Once finished, you should be taken back to the startup screen. Click ‘Create New Project.’ You should now see a screen asking you to set the location of your project (it will get its own folder), as well as the Python interpreter (fig. 10). The interpreter should already be set, as we just configured the default settings. If you wanted to for any reason, you could change the interpreter here. Choose the location where you would like your project (I have a Python folder where I keep all the sub-folders for each project), then click ‘Create.’
Now you should see a blank project screen appear (I called my project ‘delete’) (fig. 11). Go to the ‘File’ menu and then choose ‘New’ (do NOT choose ‘New Project’). You should see several choices pop up in the left side pane. Choose ‘Python file’ (fig. 12). You will then see a small popup asking you to name your file. Name it whatever you wish and hit ‘OK’ (fig. 13).
We are almost there. We have set the default interpreter and created the project and the first script. We only need to make one last adjustment to configure this individual project (you must do this last step for every project you create). Many Python programs use multiple scripts (.py files), so we have to designate the ‘root’ script (the main script which loads the rest). Because you can have multiple scripts in your project, and have several open at once, PyCharm wants to make sure it knows the main script. Click the ‘Run’ menu up top (fig. 14), and then choose ‘Edit Configurations’ to adjust the individual project settings.
In this new window, you need to adjust two things. The first is that you need to create a new configuration. This will tell PyCharm how to put your script into action. Click the green ‘+’ symbol at the top left, and then choose Python (fig. 15). Earlier, you set the name of the folder in which this project is located as well as the name of the individual script itself. Here, you set the name for the project as a whole. Then, click the button to the right of the script field (fig. 16). This is the step where you set the main script, which will execute whenever you hit the ‘run’ button. Once you hit that button, you should see a file navigation window open. Go to the location where your script is (most likely, the folder in which it sits came up by default). Click your script and hit okay. Now, back in the project configuration menu, click apply, then click okay again.
That’s it! You are ready to code (if you need lessons on Python itself, try Codecademy’s Python course). Why don’t we give it a short test to make sure it worked?[/fruitful_tab] [fruitful_tab title=”Step 4: Testing it Out”]
Click into the blank script on the right and type the following line of code (it is a famous bit of code going back to at least the 1970s, and it is most programmers’ first program).
print("Hello, World!")
One of the reasons PyCharm (and other developer tools) is great is the way it helps you catch errors. In the next image (no need to do this), I have made an error by typing “primt” instead of “print” (fig. 18). Notice that PyCharm underlines the mistake in red, much like Microsoft Word points out spelling errors. This is a handy way to check for syntax (and other) errors that would normally take a long time to find.
Now, hit the ‘Run’ button (fig. 17). If you wrote the code correctly, you should see the output of your program displayed in a new pane that pops up below (fig. 19).
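With the first script running, you can also check from inside Python that PyCharm really is using the Anaconda interpreter. What follows is a minimal sketch, not an official check: the function names `describe_interpreter` and `check_modules` are invented for this example, and `nltk` and `lxml` are assumed to be the import names of the natural language toolkit and the lxml parser mentioned earlier.

```python
import importlib.util
import sys


def describe_interpreter():
    """Return the path and version of the Python running this script."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    return sys.executable, version


def check_modules(names):
    """Map each top-level module name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    path, version = describe_interpreter()
    print("Interpreter:", path)
    print("Version:", version)

    # Assumed import names for the modules discussed in this lesson.
    for name, found in check_modules(["nltk", "lxml"]).items():
        print(name, "is installed" if found else "is NOT installed")
```

If the interpreter path points into the folder where you installed Anaconda, the configuration from Step 2 took effect. If a module is reported as missing, the “pip install” command in PyCharm’s terminal panel is the place to start.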
Congratulations, you have set up both Anaconda Python and the PyCharm tool to use it. Now you are ready to write your own scripts on your own machine. If you need lessons on Python, make sure (again) to check out Codecademy’s course (I swear they don’t pay me). If you want specific tutorials on how to apply Python to historical research, check out programminghistorian.org for fantastic examples and lessons.
Good luck and happy coding![/fruitful_tab] [/fruitful_tabs]