Guide · November 17, 2026 · 6 min read

The Data Scientist's Dev Environment: Jupyter, Conda, and Config Sync

Jupyter configs, conda environments, matplotlib defaults, and API keys for data providers. Sync your entire data science setup with one command.

Data Science Environments Are Fragile

Data science work is uniquely dependent on environment consistency. A different version of pandas can produce different aggregation results. A missing Jupyter extension means your visualization workflow breaks. A conda environment that took an hour to resolve is gone when you switch machines. And the API keys for your data providers are scattered across notebook cells and shell history.

Most data scientists work across at least two machines: a laptop for exploratory work and a more powerful desktop or cloud instance for training. Keeping these environments in sync manually is a recipe for wasted time and irreproducible results.

ConfigSync bridges this gap by syncing your Jupyter configuration, conda environment specifications, plotting defaults, and encrypted API keys across every machine you use.

Jupyter Configuration and Extensions

Jupyter's configuration controls everything from default kernel behavior to security settings and extension state. These files live in ~/.jupyter/ and are almost never backed up.

Track Jupyter configuration
# Track Jupyter notebook config
$ configsync add config ~/.jupyter/jupyter_notebook_config.py
$ configsync add config ~/.jupyter/jupyter_lab_config.py

# Track JupyterLab settings and extensions
$ configsync add config ~/.jupyter/lab/user-settings/

# Track IPython profile
$ configsync add config ~/.ipython/profile_default/ipython_config.py
$ configsync add config ~/.ipython/profile_default/startup/

Your IPython startup directory is particularly valuable. If you have auto-imports (numpy as np, pandas as pd) or custom magic commands, those live in startup scripts that execute every time you open a notebook.
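As a concrete illustration, here is what one of those startup scripts might look like. IPython executes every .py file in the startup directory, in lexical order, at the start of each session. The filename and the pandas option below are illustrative, and the script is written to a demo directory rather than the real profile path:

```shell
# Create an auto-import startup script (demo path; normally this is
# ~/.ipython/profile_default/startup/).
STARTUP_DIR=/tmp/demo-ipython-startup
mkdir -p "$STARTUP_DIR"

# IPython runs this file on every kernel start, so np and pd are
# always in scope in your notebooks.
cat > "$STARTUP_DIR/00-auto-imports.py" <<'EOF'
import numpy as np
import pandas as pd

pd.set_option("display.max_columns", 50)
EOF
```

Because the file is tracked via the startup directory, every machine you pull on gets the same auto-imports for free.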

Conda Environment Exports

Conda environments take time to build. Dependency resolution alone can take minutes for complex environments. Rather than rebuilding from scratch, export your environment and let ConfigSync sync the specification.

Export and track conda environments
# Export your current conda environment
$ conda env export > ~/environment.yml

# Track it with ConfigSync
$ configsync add config ~/environment.yml

# For multiple environments, export each one:
$ conda env export -n ml-training > ~/conda-envs/ml-training.yml
$ conda env export -n data-analysis > ~/conda-envs/data-analysis.yml
$ configsync add config ~/conda-envs/
Use a post-pull hook to automatically recreate your conda environment when you pull on a new machine:

$ configsync config set hooks.post-pull "conda env update --file ~/environment.yml --prune"
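One portability note: if your laptop and training box run different operating systems, a plain export pins platform-specific build strings that won't resolve elsewhere. Conda's `--no-builds` flag drops them. The sketch below shows what a minimal, portable spec looks like; the environment name and package versions are illustrative, and the file is written to a demo path:

```shell
# For cross-machine syncing, export without build strings:
#   conda env export --no-builds > ~/environment.yml
# The resulting spec is roughly this shape (versions illustrative):
cat > /tmp/demo-environment.yml <<'EOF'
name: data-analysis
channels:
  - conda-forge
dependencies:
  - python=3.11
  - pandas=2.1
  - numpy=1.26
  - scikit-learn=1.3
EOF
```

A spec like this is what the post-pull hook feeds to `conda env update`, so keeping it platform-neutral is what makes the one-command restore work on any machine.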

Matplotlib, Seaborn, and Plotting Defaults

Custom plotting styles are the kind of configuration that takes hours to perfect and seconds to lose. Matplotlib reads from a matplotlibrc file, and Seaborn builds on top of it. If you have a custom style, track it.

Track plotting configuration
# Track matplotlib config
$ configsync add config ~/.config/matplotlib/matplotlibrc
$ configsync add config ~/.config/matplotlib/stylelib/

# Track any custom plotting scripts
$ configsync add config ~/plotting-defaults.py
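For reference, a custom style sheet in stylelib/ is just a small rc-format file that you activate with `plt.style.use("paper")`. The style name and every value below are illustrative, and the file is written to a demo directory rather than the real config path:

```shell
# Create a custom matplotlib style (demo path; normally
# ~/.config/matplotlib/stylelib/).
STYLE_DIR=/tmp/demo-matplotlib/stylelib
mkdir -p "$STYLE_DIR"

# rc-format key: value pairs; loaded with plt.style.use("paper")
cat > "$STYLE_DIR/paper.mplstyle" <<'EOF'
figure.figsize: 6.4, 4.0
figure.dpi: 150
font.size: 9
axes.grid: True
grid.alpha: 0.3
savefig.bbox: tight
EOF
```

Once the stylelib directory is tracked, that `plt.style.use("paper")` call works identically on every synced machine.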

If your team has a standard plotting style for publications or presentations, this is an excellent candidate for shared team configs. Everyone gets the same chart aesthetics without copying files around.

API Keys for Data Providers

Data scientists work with APIs from providers like Kaggle, Hugging Face, OpenAI, Snowflake, and various data brokers. These keys should never live in notebook cells or plain text files.

Store data API keys as secrets
# Store API keys securely
$ configsync secret set KAGGLE_KEY
$ configsync secret set HF_TOKEN
$ configsync secret set OPENAI_API_KEY
$ configsync secret set SNOWFLAKE_PASSWORD

# Track the Kaggle config file (contains username + key)
$ configsync add config ~/.kaggle/kaggle.json --encrypt

# Track Hugging Face CLI config
$ configsync add config ~/.cache/huggingface/token --encrypt

By storing these as ConfigSync secrets, they are encrypted at rest and available on any machine where you pull your environment. No more searching Slack history for that API key your colleague shared six months ago.
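To see why kaggle.json is worth encrypting, here is its shape: a single JSON object holding your username and API key in plaintext. The Kaggle CLI also expects the file to be readable only by you, hence the chmod. Credential values below are placeholders, and the file is written to a demo directory rather than ~/.kaggle:

```shell
# kaggle.json holds plaintext credentials (demo path; normally ~/.kaggle/).
KAGGLE_DIR=/tmp/demo-kaggle
mkdir -p "$KAGGLE_DIR"

cat > "$KAGGLE_DIR/kaggle.json" <<'EOF'
{"username": "your-username", "key": "your-api-key"}
EOF

# Restrict permissions so only the owner can read the key.
chmod 600 "$KAGGLE_DIR/kaggle.json"
```

Tracking this file with --encrypt means the plaintext key never leaves your machine unprotected, while a pull on a new box restores both the file and its strict permissions.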

Pip and Package Manager Configuration

Beyond conda, you likely have pip configuration, Poetry settings, or other Python tool configs that affect how packages are installed and resolved.

Config File            Purpose                           Track With
~/.pip/pip.conf        Custom index URLs, trusted hosts  configsync add config
~/environment.yml      Conda env specification           configsync add config
~/.jupyter/            Notebook and Lab settings         configsync add config
~/.kaggle/kaggle.json  Kaggle API credentials            configsync add config --encrypt
~/.config/matplotlib/  Plotting defaults and styles      configsync add config
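As an example of the pip side, a pip.conf pointing at a private package index is only a few lines. The index host below is a placeholder, and the file is written to a demo directory rather than the real config path:

```shell
# A pip.conf with a private index (demo path; normally ~/.pip/
# or ~/.config/pip/).
mkdir -p /tmp/demo-pip

cat > /tmp/demo-pip/pip.conf <<'EOF'
[global]
index-url = https://pypi.example.com/simple
trusted-host = pypi.example.com
timeout = 60
EOF
```

If that index URL embeds a token, treat the file like kaggle.json and track it with --encrypt instead.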

The Post-Pull Hook: Automated Environment Setup

The real power for data scientists is the post-pull hook. When you pull your ConfigSync state on a new machine, you want conda environments to rebuild automatically.

Automate environment restoration
# Set up the post-pull hook
$ configsync config set hooks.post-pull "bash ~/.configsync/ds-restore.sh"

# ~/.configsync/ds-restore.sh
#!/bin/bash
echo "Restoring data science environment..."

# Update conda env from exported spec
conda env update --file ~/environment.yml --prune

# Install JupyterLab extensions
# (on JupyterLab 3+, extensions are usually pip packages instead)
jupyter labextension install @jupyterlab/toc 2>/dev/null

# Verify key packages
python -c "import pandas, numpy, sklearn; print('Core packages OK')"

echo "Data science environment restored."

With this setup, configsync pull on a new machine restores your Jupyter config, recreates your conda environments, installs your extensions, and verifies that core packages are available. Your entire data science workflow is one command away from any machine.

Ready to try ConfigSync?

Sync your entire dev environment across machines in minutes. Free forever for up to 3 devices.