
Python is a versatile, general-purpose programming language increasingly used in data science and machine learning. Its clear and readable syntax makes Python particularly beginner-friendly. Thanks to a vast ecosystem of libraries, Python covers a wide range of applications – from data analysis and visualization to statistical modelling and deep learning.
In our courses, Python is mainly used in Data Science 3 (DS3) for machine learning.
Installation
For working with Python in data science, we recommend installing one of the following distributions:
- Miniforge – Lightweight, open-source distribution with conda-forge as the default channel. Recommended for experienced users.
- Anaconda – Comprehensive distribution with many pre-installed data science packages. Ideal for beginners.
Both distributions include the conda package manager, which simplifies the installation and management of packages and environments.
We recommend Miniforge if you want a lean installation that contains only the packages you actually need. Anaconda is the more convenient choice if you prefer to have many packages pre-installed.
After installing Miniforge or Anaconda, you can install Python libraries with conda:
conda install pandas numpy matplotlib seaborn scikit-learn scipy statsmodels
The Graphical User Interface (GUI)
Python itself does not have a graphical user interface – it is typically used via the command line or within an integrated development environment (IDE). For working with Python, we recommend one of the following IDEs:
- Positron – New IDE from Posit with native support for R and Python
- Visual Studio Code – Versatile code editor with Python extension
- JupyterLab – Interactive notebook environment, especially popular in data science
JupyterLab is already included in Anaconda. With Miniforge, it can be installed with conda:
conda install jupyterlab
Key Libraries
| Library | Description |
|---|---|
| pandas | Data manipulation and analysis |
| numpy | Numerical computations and array operations |
| matplotlib | Basic data visualization |
| seaborn | Statistical visualization (built on matplotlib) |
| scikit-learn | Machine learning (classification, regression, clustering) |
| scipy | Scientific computing and statistical tests |
| statsmodels | Statistical models and econometric analysis |
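To check which of the libraries in the table are available in the currently active environment, a small script along the following lines may help. It is a minimal sketch that uses only the Python standard library. The names `IMPORT_NAMES` and `check_libraries` are made up for this illustration. One detail worth knowing: scikit-learn is installed under the name `scikit-learn` but imported as `sklearn`.

```python
from importlib.util import find_spec

# Maps each package name (as used with conda install) to the name used
# in an import statement. Only scikit-learn differs: it is imported as sklearn.
# IMPORT_NAMES is an illustrative name, not part of any library.
IMPORT_NAMES = {
    "pandas": "pandas",
    "numpy": "numpy",
    "matplotlib": "matplotlib",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
    "scipy": "scipy",
    "statsmodels": "statsmodels",
}


def check_libraries(import_names=IMPORT_NAMES):
    """Return a dict mapping each package name to True if it can be found.

    find_spec looks the module up without actually importing it,
    so the check is fast and has no side effects from the libraries.
    """
    return {pkg: find_spec(mod) is not None for pkg, mod in import_names.items()}


if __name__ == "__main__":
    for pkg, available in check_libraries().items():
        status = "installed" if available else "missing"
        print(f"{pkg:<14}{status}")
```

Run the script with the same Python interpreter that your IDE or notebook uses. Any package reported as missing can then be installed with conda as described in the installation section.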
Further Resources
- Python.org – Official Python website
- Python Documentation – Official documentation
- Anaconda – Python distribution for data science
- Miniforge – Lightweight conda-forge distribution
- Real Python – Tutorials and articles