Install Your Tools

The Netflix Advanced Data Science Boot Camp will expose you to many tools. Some or all of them may be new to you. Believe it or not, you'll soon find these tools as familiar as email or an app on your phone.

We'll summarize each tool and then give you the opportunity to install the tools before the first class.

Ready for Action!

Make sure that you have the following items available before continuing the prework:

  • A Mac or Windows laptop with 8 GB of RAM and a 64-bit dual-core processor.

Note: You cannot use Linux in this course.

  • Webcam, microphone, and headphones.

  • (Recommended) An external monitor compatible with your computer.

  • High-speed internet connection (minimum speed requirements: download 25 Mbps, upload 5 Mbps).

Required Software

The following is a list of the tools you should install before the first day of class:

  • Microsoft Excel (Excel 2016 or later; a Microsoft 365, formerly Office 365, subscription is acceptable) (purchase optional)
  • VS Code (latest version)
  • Git (latest version)
  • Git Bash (Windows only)
  • pgAdmin and Postgres
  • Python (version 3.8)
  • Anaconda (version 4.10.1 or later)

Mac-specific and Windows-specific instructions for installing each tool are linked at the bottom of this page.

If you encounter any issues during setup, first try to troubleshoot with resources such as Google and Stack Overflow. If the problems persist, reach out to the instructor or TAs for assistance. The more people who get these tools installed and configured before class, the more time we can spend on the class topics!


An Overview of Your Tools

Before installing your software tools, take a moment to examine each of them in more detail to better understand the role they will play in the course.

Microsoft Excel (purchase optional)

For many years, Excel has been a go-to tool for visualizing and manipulating data. Regardless of skill level and background, many data scientists still use Excel to present data to non-technical peers. In this course, we will use Excel only on the first day of class as a warm-up, covering some of the functions that data scientists use most often.

Because Excel is used only briefly, purchasing it is encouraged but not mandatory. We will also provide instructions for downloading a free trial of Microsoft 365. You will not be charged as long as you cancel before the free trial period ends.

  • Excel is available on a subscription basis via Microsoft 365 (formerly Office 365) or as a one-time purchase. You can find details about both options at the Microsoft Store. Before completing your purchase, make sure that you've selected the correct operating system and version (Excel 2016 or later).

  • If you do not wish to purchase Excel, you can use the free web version of Microsoft Office by signing up for a Microsoft account. Note that the interface may look slightly different from the installed version, so some instructions may not match exactly, but the concepts are the same.

VS Code

Visual Studio (VS) Code is a free text editor that runs on Mac, Linux, and Windows operating systems. VS Code has become the text editor of choice for a substantial number of developers for good reason:

  • It has syntax highlighting, linting (think grammar check for programming), and spell-checking for almost any programming language

  • It has an integrated terminal and Jupyter Notebook support, so you can run code without leaving the editor

  • It has a huge developer community for custom plugins, extensions and quality-of-life improvements

We highly encourage everyone to use VS Code for their programming throughout the course, but this software is not required.

Git and GitHub

Writing code is largely collaborative, as developers constantly build on each other's work. Git is a version-control system that tracks changes to files and helps developers coordinate this collaboration, and GitHub hosts Git repositories online. You can think of GitHub as a sort of Dropbox for coders: a central place for developers to upload their code, view revision history, and make changes to a master set of files. You'll learn a lot about Git and GitHub in your first few weeks of class.

If you have not used Git or GitHub before, first sign up for a GitHub account. Keep in mind that more and more companies in the tech and medical industries ask for a link to your GitHub profile, so choose your username wisely!

pgAdmin and Postgres

pgAdmin is a web-based GUI tool for interacting with Postgres databases, both locally and remotely. It allows you to write and execute SQL queries. PostgreSQL, or Postgres, is a free and open-source relational database management system (RDBMS) that implements the SQL standard.
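
If you have never seen a SQL query, the short sketch below may give a rough idea of what "writing and executing a query" means. It uses Python's built-in sqlite3 module as a stand-in, not Postgres or pgAdmin, so that it runs without a database server. The table name `tools` and its rows are made up for illustration.

```python
import sqlite3

# sqlite3 is only a stand-in here; the course itself uses Postgres via pgAdmin.
conn = sqlite3.connect(":memory:")  # temporary in-memory database

# Hypothetical table and rows, invented for this example.
conn.execute("CREATE TABLE tools (name TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO tools VALUES (?, ?)",
    [
        ("pgAdmin", "database"),
        ("Postgres", "database"),
        ("VS Code", "editor"),
        ("Git", "version control"),
    ],
)

# A SQL query: select the names of all tools in one category.
rows = conn.execute(
    "SELECT name FROM tools WHERE category = ?", ("database",)
).fetchall()

for (name,) in rows:
    print(name)

conn.close()
```

In pgAdmin you would type only the SQL statements themselves into the query tool; the Python wrapper here exists just to make the example self-contained.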

Python and Anaconda

Python and its ecosystem are extremely popular in the world of data science and engineering. This course uses an equally popular all-in-one distribution called Anaconda, which handles our Python environment, library installation, and library maintenance: everything we'll need for the Python units of the course.
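
Once Python is installed, a small script like the one below can report which interpreter and version you are actually running. This is a minimal sketch. It assumes that the 3.8 listed under Required Software is a minimum rather than an exact requirement, and the helper name `meets_requirement` is our own.

```python
import sys

# Assumption: version 3.8 from the Required Software list is treated as a minimum.
REQUIRED = (3, 8)


def meets_requirement(version_info, required=REQUIRED):
    """Return True when the major.minor version is at least `required`."""
    return tuple(version_info[:2]) >= required


# Show which interpreter is running; with Anaconda installed, the path
# will usually point inside your Anaconda folder.
print("Python executable:", sys.executable)
print("Python version:", sys.version.split()[0])
print("Meets course requirement:", meets_requirement(sys.version_info))
```

If the executable path or version is not what you expect, a different Python installation is probably being picked up first on your system.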

You're Ready to Install!

Now it's time to collect your tools and begin. Setup guides for both Mac and Windows users are provided on the next pages. Follow the instructions closely and do your best with the information you have.
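
After you have worked through the setup guides, one possible way to confirm that the command-line tools are reachable is the Python sketch below. The command names (`git`, `code`, `psql`, `conda`, `python`) are assumptions about how each tool is exposed on your system and may differ on your machine. A tool reported as missing may still be installed but not added to your PATH.

```python
import shutil

# Assumed command names; adjust them if your installation uses different ones.
TOOLS = {
    "Git": "git",
    "VS Code": "code",
    "Postgres client": "psql",
    "Anaconda": "conda",
    "Python": "python",
}


def find_tools(tools):
    """Map each tool name to the path of its command, or None if not on PATH."""
    return {name: shutil.which(command) for name, command in tools.items()}


if __name__ == "__main__":
    for name, path in find_tools(TOOLS).items():
        print(f"{name}: {path if path else 'not found on PATH'}")
```

Excel and pgAdmin are desktop applications, so they are left out of this check; open them directly to confirm they installed correctly.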