Conda & Miniconda Repositories
This section provides comprehensive documentation for using CloudRepo with Conda and Miniconda, the package and environment management systems widely used in data science, machine learning, and scientific computing.
Introduction to Conda
What is Conda?
Conda is a powerful, open-source package and environment management system that runs on Windows, macOS, and Linux. Originally developed for Python programs, Conda has evolved into a language-agnostic package manager that can install and manage packages from any language. Miniconda is a minimal installer that includes only Conda, Python, and their dependencies, providing a lightweight starting point for creating custom environments.
Conda’s Role in Scientific Python and Data Science
Conda has become the de facto standard in scientific computing and data science for several compelling reasons:
Binary package management: Where pip's strength is Python packages, Conda excels at managing complex binary dependencies, including C/C++ libraries, CUDA toolkits, and system-level packages
Cross-language support: Seamlessly manages packages from Python, R, Julia, Scala, Java, JavaScript, C/C++, FORTRAN, and more
Environment isolation: Creates truly isolated environments with their own Python interpreters and system libraries
Reproducible science: Enables exact environment replication across different machines and operating systems
Optimized packages: Conda-forge and defaults channels provide pre-compiled, optimized binaries for scientific libraries like NumPy, SciPy, and TensorFlow
Dependency solver: Advanced SAT solver handles complex dependency graphs that often challenge pip
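Environment isolation can also be verified from inside Python itself, which is useful in scripts that must not run against the system interpreter. A minimal sketch (the helper name `active_conda_env` is ours, not a conda API):

```python
import os
import sys

def active_conda_env():
    """Return the name of the active conda environment, or None.

    `conda activate` exports CONDA_PREFIX and CONDA_DEFAULT_ENV, so a
    script can confirm it is running inside the isolated interpreter
    it expects rather than the system Python.
    """
    if "CONDA_PREFIX" not in os.environ:
        return None
    return os.environ.get("CONDA_DEFAULT_ENV",
                          os.path.basename(os.environ["CONDA_PREFIX"]))

env = active_conda_env()
print(f"Interpreter: {sys.executable}")
print(f"Conda environment: {env or 'none (system or venv Python)'}")
```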
Why Teams Need Private Package Management
Data science and research teams increasingly require private package management for:
Proprietary algorithms: Protecting intellectual property in custom ML models and analysis tools
Internal libraries: Sharing data processing pipelines, feature engineering tools, and domain-specific utilities
Compliance and security: Maintaining control over package distribution in regulated industries (healthcare, finance, government)
Version control: Ensuring reproducibility of experiments and analyses across team members
Performance optimization: Distributing custom-compiled packages optimized for specific hardware
Data connectors: Securely sharing database connectors and API clients with embedded credentials
CloudRepo’s Conda Strategy
CloudRepo provides a comprehensive solution for Conda users through a two-phase approach:
- Currently Available: PyPI Integration
CloudRepo fully supports Python packages that can be installed in Conda environments using pip. This covers the majority of custom Python packages that data science teams develop, including pure Python libraries, data processing tools, and ML model packages.
- Coming Soon: Native Conda Channels
Native Conda channel support is on our near-term roadmap, which will enable hosting of conda packages with complex binary dependencies, multi-language packages, and custom-compiled scientific libraries. This positions CloudRepo as a complete alternative to Anaconda Enterprise and JFrog Artifactory for Conda users.
Prerequisites
Before configuring Conda to work with CloudRepo, ensure you have:
CloudRepo Account and Repository
An active CloudRepo organization account
A Python repository created in the CloudRepo Admin Portal
A repository user with appropriate permissions (read for downloading, write for uploading)
Note
Your admin user cannot access repositories directly for security reasons. Always create dedicated repository users in the CloudRepo Admin Portal.
For setup instructions, see:
Conda or Miniconda Installation
Conda (via Anaconda) or Miniconda installed on your system
Verify installation:
conda --version
Repository URL Format
Your CloudRepo PyPI repository URL follows this pattern:
https://[organization-id].mycloudrepo.io/repositories/[repository-id]/simple
You can find the exact URL in the CloudRepo Admin Portal under your repository’s settings.
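In deployment scripts it can be handy to assemble this URL programmatically. A small sketch with made-up IDs (`acme` and `python-dev` are placeholders, not real accounts):

```python
def cloudrepo_index_url(org_id, repo_id, username=None, password=None):
    """Build a CloudRepo PyPI index URL; the /simple suffix is required
    by pip's index protocol."""
    auth = f"{username}:{password}@" if username and password else ""
    return f"https://{auth}{org_id}.mycloudrepo.io/repositories/{repo_id}/simple"

print(cloudrepo_index_url("acme", "python-dev"))
# → https://acme.mycloudrepo.io/repositories/python-dev/simple
```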
Using CloudRepo with Conda for PyPI Packages
Configuring pip within Conda Environments
CloudRepo seamlessly integrates with Conda environments for Python packages. Here’s how to configure pip to use your private CloudRepo repository:
Method 1: Environment-Specific Configuration
# Create and activate a new environment
conda create -n myproject python=3.11
conda activate myproject
# Configure pip to use CloudRepo
pip config set global.index-url https://username:password@[org-id].mycloudrepo.io/repositories/[repo-id]/simple
pip config set global.extra-index-url https://pypi.org/simple
# Verify configuration
pip config list
# Set environment variables (add to .bashrc/.zshrc for persistence)
export PIP_INDEX_URL=https://username:password@[org-id].mycloudrepo.io/repositories/[repo-id]/simple
export PIP_EXTRA_INDEX_URL=https://pypi.org/simple
# Install packages
pip install your-private-package
Method 2: Per-Project Configuration with pip.conf
# ~/.config/pip/pip.conf (Linux/macOS) or %APPDATA%\pip\pip.ini (Windows)
[global]
index-url = https://username:password@[org-id].mycloudrepo.io/repositories/[repo-id]/simple
extra-index-url = https://pypi.org/simple
[install]
trusted-host = [org-id].mycloudrepo.io
Mixing Conda-Forge and Private PyPI Packages
Data science projects often require packages from both conda-forge (for optimized binaries) and private PyPI repositories (for proprietary code). Here’s the recommended approach:
name: data-science-project
channels:
  - conda-forge
  - defaults
dependencies:
  # Conda packages for optimized binaries
  - python=3.11
  - numpy=1.24.*
  - pandas=2.0.*
  - scikit-learn=1.3.*
  - matplotlib=3.7.*
  - jupyter=1.0.*
  - pytorch=2.0.*
  - cudatoolkit=11.8  # For GPU support

  # Pip packages including private CloudRepo packages
  - pip
  - pip:
      # Private packages from CloudRepo
      - --index-url https://username:password@[org-id].mycloudrepo.io/repositories/[repo-id]/simple
      - --extra-index-url https://pypi.org/simple
      - your-private-ml-library==2.1.0
      - internal-data-connectors==1.5.3
      - proprietary-feature-engine==3.0.1

      # Public packages better installed via pip
      - transformers==4.30.0
      - langchain==0.0.200
# Create environment from file
conda env create -f environment.yml
# Activate environment
conda activate data-science-project
# Verify private packages installed
pip list | grep your-private
Authentication Setup
CloudRepo supports multiple authentication methods for different scenarios:
Secure Credential Storage with keyring
# Install keyring in your conda environment
conda activate myproject
pip install keyring
# Store credentials securely
keyring set https://[org-id].mycloudrepo.io/repositories/[repo-id]/simple username
# Enter your password when prompted
# Configure pip to use keyring
pip config set global.index-url https://[org-id].mycloudrepo.io/repositories/[repo-id]/simple
Using .netrc for Authentication
machine [org-id].mycloudrepo.io
login your-username
password your-password
chmod 600 ~/.netrc
Environment Variables for CI/CD
# Set in your CI/CD system's secret management
export CLOUDREPO_USERNAME=your-username
export CLOUDREPO_PASSWORD=your-password
# Use in pip commands
pip install --index-url https://${CLOUDREPO_USERNAME}:${CLOUDREPO_PASSWORD}@[org-id].mycloudrepo.io/repositories/[repo-id]/simple your-package
Publishing Python Packages for Conda Users
Building Conda-Compatible Wheels
When publishing Python packages for Conda users, ensure compatibility by following these best practices:
from setuptools import setup, find_packages
import sys

# Heuristic: detect a virtualenv or conda environment
# (informational only; not used by setup() below)
is_conda = sys.prefix != sys.base_prefix or 'conda' in sys.prefix

setup(
    name='your-data-science-library',
    version='1.0.0',
    packages=find_packages(),
    python_requires='>=3.8,<3.12',
    install_requires=[
        'numpy>=1.20,<2.0',
        'pandas>=1.3,<3.0',
        'scikit-learn>=1.0',
    ],
    extras_require={
        'viz': ['matplotlib>=3.5', 'seaborn>=0.12'],
        'gpu': ['cupy>=10.0'],
        'dev': ['pytest>=7.0', 'black>=22.0', 'mypy>=1.0'],
    },
    # Metadata
    author='Your Team',
    author_email='team@company.com',
    description='Internal data science utilities',
    long_description=open('README.md').read(),
    long_description_content_type='text/markdown',
    classifiers=[
        'Development Status :: 4 - Beta',
        'Intended Audience :: Science/Research',
        'Programming Language :: Python :: 3.8',
        'Programming Language :: Python :: 3.9',
        'Programming Language :: Python :: 3.10',
        'Programming Language :: Python :: 3.11',
    ],
)
[build-system]
requires = ["setuptools>=61.0", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "your-data-science-library"
version = "1.0.0"
description = "Internal data science utilities"
readme = "README.md"
requires-python = ">=3.8,<3.12"
dependencies = [
    "numpy>=1.20,<2.0",
    "pandas>=1.3,<3.0",
    "scikit-learn>=1.0",
]

[project.optional-dependencies]
viz = ["matplotlib>=3.5", "seaborn>=0.12"]
gpu = ["cupy>=10.0"]
dev = ["pytest>=7.0", "black>=22.0", "mypy>=1.0"]

[tool.setuptools.packages.find]
include = ["your_package*"]
Publishing to CloudRepo’s PyPI Index
Upload your packages to CloudRepo using standard Python packaging tools:
# Install build tools in conda environment
conda activate myproject
pip install build twine
# Build your package
python -m build
# Upload to CloudRepo
twine upload \
--repository-url https://[org-id].mycloudrepo.io/repositories/[repo-id] \
--username your-username \
--password your-password \
dist/*
# Note: the legacy `python setup.py upload` command is deprecated and has
# been removed from modern setuptools; always use `twine upload` as above.
Version Management for Data Science Packages
Data science packages often require careful version management due to model compatibility:
# your_package/__version__.py
__version__ = "2.1.0"

# Version components:
# MAJOR.MINOR.PATCH
# 2 - Major model architecture changes
# 1 - New features, backward compatible
# 0 - Bug fixes, performance improvements

# Model versioning
MODEL_VERSION = "2.1"  # Tracks trained model compatibility
API_VERSION = "v2"     # Tracks API compatibility
# Beta releases for team testing (set __version__ = "2.1.0b1" first)
python -m build
# Creates: your_package-2.1.0b1-py3-none-any.whl
# Upload beta to CloudRepo
twine upload --repository-url https://[org-id].mycloudrepo.io/repositories/[repo-id] dist/*2.1.0b1*
# Install beta version explicitly
pip install your-package==2.1.0b1
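Pre-release ordering is what makes this workflow safe: pip considers `2.1.0b1` older than `2.1.0`, so a plain `pip install your-package` never picks up a beta unless `--pre` is passed or the beta is pinned explicitly. A rough sketch of that ordering using only the standard library (real tools should use `packaging.version`, which implements full PEP 440):

```python
import re

def parse_version(v):
    """Parse 'MAJOR.MINOR.PATCH[{a|b|rc}N]' into a sortable tuple.
    Illustrative only; PEP 440 is more complex than this."""
    m = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:(a|b|rc)(\d+))?", v)
    major, minor, patch, pre, pre_n = m.groups()
    # A final release sorts after any pre-release of the same version
    pre_rank = {"a": 0, "b": 1, "rc": 2, None: 3}[pre]
    return (int(major), int(minor), int(patch), pre_rank, int(pre_n or 0))

assert parse_version("2.1.0b1") < parse_version("2.1.0")
assert parse_version("2.1.0b2") < parse_version("2.1.0rc1")
```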
Platform-Specific Packages
Data science teams often need platform-specific builds for performance:
name: Build and Publish Platform Wheels

on:
  push:
    tags:
      - 'v*'

jobs:
  build_wheels:
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
        python-version: ['3.8', '3.9', '3.10', '3.11']

    runs-on: ${{ matrix.os }}

    # Login shell so the conda environment from setup-miniconda is active
    defaults:
      run:
        shell: bash -l {0}

    steps:
      - uses: actions/checkout@v3

      - uses: conda-incubator/setup-miniconda@v2
        with:
          python-version: ${{ matrix.python-version }}
          activate-environment: build-env

      - name: Install dependencies
        run: |
          conda install -c conda-forge numpy cython
          pip install build twine

      - name: Build platform wheel
        run: python -m build --wheel

      - name: Upload to CloudRepo
        env:
          CLOUDREPO_USER: ${{ secrets.CLOUDREPO_USER }}
          CLOUDREPO_PASS: ${{ secrets.CLOUDREPO_PASS }}
        run: |
          twine upload \
            --repository-url https://[org-id].mycloudrepo.io/repositories/[repo-id] \
            --username $CLOUDREPO_USER \
            --password $CLOUDREPO_PASS \
            dist/*.whl
Conda Environment Management with CloudRepo
Environment.yml with pip Dependencies from CloudRepo
Create reproducible environments that combine conda and CloudRepo packages:
name: ml-pipeline-prod
channels:
  - conda-forge
  - pytorch
  - nvidia
dependencies:
  # Core scientific stack
  - python=3.10.*
  - numpy=1.24.*
  - pandas=2.0.*
  - scipy=1.10.*

  # Machine Learning
  - scikit-learn=1.3.*
  - xgboost=1.7.*
  - lightgbm=4.0.*
  - pytorch=2.0.*
  - pytorch-cuda=11.8.*

  # Deep Learning
  - tensorflow=2.12.*
  - keras=2.12.*

  # Data Visualization
  - matplotlib=3.7.*
  - seaborn=0.12.*
  - plotly=5.14.*
  - bokeh=3.1.*

  # Jupyter Ecosystem
  - jupyter=1.0.*
  - jupyterlab=4.0.*
  - ipywidgets=8.0.*
  - nbconvert=7.4.*

  # Development Tools
  - black=23.3.*
  - pylint=2.17.*
  - pytest=7.3.*
  - pytest-cov=4.1.*

  # Pip for CloudRepo packages
  - pip
  - pip:
      # CloudRepo configuration
      - --index-url https://username:password@[org-id].mycloudrepo.io/repositories/[repo-id]/simple
      - --extra-index-url https://pypi.org/simple

      # Internal packages
      - company-ml-core==3.2.1
      - feature-pipeline==2.0.0
      - model-registry-client==1.5.0
      - data-quality-tools==1.1.0

      # Additional PyPI packages
      - mlflow==2.3.0
      - wandb==0.15.0
      - dvc==3.0.0

# Environment variables
variables:
  PYTHONPATH: /opt/custom/libs:$PYTHONPATH
  CUDA_HOME: /usr/local/cuda-11.8
Reproducible Environments
Ensure perfect reproducibility across team members and deployments:
# Export current environment with exact versions
conda list --explicit > conda-packages.txt
pip freeze > pip-packages.txt
# Create combined lock file
conda env export --no-builds > environment-lock.yml
name: ml-pipeline-prod
channels:
  - conda-forge
  - pytorch
dependencies:
  - python=3.10.11
  - numpy=1.24.3
  - pandas=2.0.2
  - pytorch=2.0.1
  - pip:
      - --index-url https://username:password@[org-id].mycloudrepo.io/repositories/[repo-id]/simple
      - company-ml-core==3.2.1
      - feature-pipeline==2.0.0
Mixing Conda and pip Packages
Best practices for combining conda and pip packages:
#!/usr/bin/env python
"""Verify conda/pip environment setup."""

import importlib
import subprocess
import sys

def check_conda_packages():
    """Verify conda-installed packages."""
    # Names as they appear in `conda list` output
    required_conda = ['numpy', 'pandas', 'scipy', 'scikit-learn']

    result = subprocess.run(['conda', 'list'],
                            capture_output=True, text=True)
    conda_packages = result.stdout

    missing = []
    for pkg in required_conda:
        if pkg not in conda_packages:
            missing.append(pkg)

    return missing

def check_pip_packages():
    """Verify pip-installed CloudRepo packages."""
    required_pip = {
        'company_ml_core': '3.2.1',
        'feature_pipeline': '2.0.0',
    }

    missing = []
    wrong_version = []

    for pkg, required_version in required_pip.items():
        try:
            module = importlib.import_module(pkg)
            if hasattr(module, '__version__'):
                if module.__version__ != required_version:
                    wrong_version.append(f"{pkg} (found: {module.__version__}, required: {required_version})")
        except ImportError:
            missing.append(pkg)

    return missing, wrong_version

def main():
    print("Checking environment setup...")

    # Check conda packages
    missing_conda = check_conda_packages()
    if missing_conda:
        print(f"❌ Missing conda packages: {', '.join(missing_conda)}")
        print("   Run: conda install " + ' '.join(missing_conda))
    else:
        print("✅ All conda packages installed")

    # Check pip packages
    missing_pip, wrong_version = check_pip_packages()
    if missing_pip:
        print(f"❌ Missing CloudRepo packages: {', '.join(missing_pip)}")
    if wrong_version:
        print(f"❌ Wrong versions: {', '.join(wrong_version)}")

    if not missing_pip and not wrong_version:
        print("✅ All CloudRepo packages correctly installed")

    # Check Python version
    python_version = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"📊 Python version: {python_version}")

    return len(missing_conda) + len(missing_pip) + len(wrong_version)

if __name__ == "__main__":
    sys.exit(main())
Lock Files with conda-lock
Use conda-lock for advanced dependency locking across platforms:
# Install conda-lock
conda install -c conda-forge conda-lock
# Generate lock file from environment.yml
conda-lock lock --file environment.yml --platform linux-64 --platform osx-64 --platform win-64
# Generated by conda-lock
version: 1
metadata:
  content_hash:
    linux-64: abc123...
    osx-64: def456...
    win-64: ghi789...
  channels:
    - conda-forge
    - pytorch
  platforms:
    - linux-64
    - osx-64
    - win-64
package:
  - name: python
    version: 3.10.11
    manager: conda
    platform: linux-64
    url: https://conda.anaconda.org/conda-forge/linux-64/python-3.10.11-h7a1cb2a_2.tar.bz2
    hash:
      md5: abc123...
  - name: company-ml-core
    version: 3.2.1
    manager: pip
    url: https://[org-id].mycloudrepo.io/repositories/[repo-id]/simple
# Install from conda-lock file
conda-lock install --name ml-pipeline-prod conda-lock.yml
# Or render to platform-specific file
conda-lock render --platform linux-64
conda env create --file conda-linux-64.lock.yml
Advanced Topics
Using Mamba for Faster Dependency Resolution
Mamba is a drop-in replacement for conda with 10-100x faster dependency resolution:
# Install mamba in base environment
conda install -n base -c conda-forge mamba
# Use mamba instead of conda
mamba env create -f environment.yml
mamba activate ml-pipeline-prod
# Mamba handles complex dependencies much faster
mamba install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
# Example ~/.condarc
channels:
  - conda-forge
  - pytorch
  - nvidia
channel_priority: strict
pip_interop_enabled: true

# Note: .condarc has no setting for pip index URLs; point pip at CloudRepo
# via pip.conf or PIP_INDEX_URL as shown earlier:
#   --index-url https://username:password@[org-id].mycloudrepo.io/repositories/[repo-id]/simple
#   --extra-index-url https://pypi.org/simple
Docker Containers with Conda and CloudRepo
Create reproducible Docker containers for deployment:
FROM continuumio/miniconda3:latest

# Set working directory
WORKDIR /app

# Copy environment file
COPY environment.yml .

# Create conda environment
RUN conda env create -f environment.yml && \
    conda clean -afy && \
    find /opt/conda/ -follow -type f -name '*.a' -delete && \
    find /opt/conda/ -follow -type f -name '*.pyc' -delete && \
    find /opt/conda/ -follow -type f -name '*.js.map' -delete

# Activate environment by default
SHELL ["conda", "run", "-n", "ml-pipeline-prod", "/bin/bash", "-c"]

# Install CloudRepo packages with credentials
ARG CLOUDREPO_USER
ARG CLOUDREPO_PASS
RUN pip install \
    --index-url https://${CLOUDREPO_USER}:${CLOUDREPO_PASS}@[org-id].mycloudrepo.io/repositories/[repo-id]/simple \
    --extra-index-url https://pypi.org/simple \
    company-ml-core==3.2.1 \
    feature-pipeline==2.0.0

# Copy application code
COPY src/ ./src/
COPY models/ ./models/

# Set entrypoint
ENTRYPOINT ["conda", "run", "--no-capture-output", "-n", "ml-pipeline-prod"]
CMD ["python", "src/main.py"]
version: '3.8'

services:
  model-training:
    build:
      context: .
      args:
        CLOUDREPO_USER: ${CLOUDREPO_USER}
        CLOUDREPO_PASS: ${CLOUDREPO_PASS}
    environment:
      - CUDA_VISIBLE_DEVICES=0
      - MLFLOW_TRACKING_URI=http://mlflow:5000
    volumes:
      - ./data:/app/data
      - ./outputs:/app/outputs
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

  jupyter:
    build:
      context: .
      dockerfile: Dockerfile.jupyter
    ports:
      - "8888:8888"
    volumes:
      - ./notebooks:/app/notebooks
    environment:
      - JUPYTER_ENABLE_LAB=yes
Jupyter Notebooks with Private Packages
Configure Jupyter to use CloudRepo packages:
# jupyter_notebook_config.py
import os

# Kernels inherit the server's environment, so setting these here makes
# CloudRepo the default pip index inside notebooks
os.environ['PIP_INDEX_URL'] = 'https://username:password@[org-id].mycloudrepo.io/repositories/[repo-id]/simple'
os.environ['PIP_EXTRA_INDEX_URL'] = 'https://pypi.org/simple'
# Cell 1: Install private packages from CloudRepo
# (%pip installs into the active kernel's environment)
%pip install --index-url https://username:password@[org-id].mycloudrepo.io/repositories/[repo-id]/simple --extra-index-url https://pypi.org/simple company-ml-core==3.2.1

# Cell 2: Import and use
import company_ml_core
from company_ml_core import FeatureEngine, ModelRegistry

# Initialize feature engine
engine = FeatureEngine()
features = engine.generate_features(df)
# Install useful extensions
conda activate ml-pipeline-prod
# Variable inspector
jupyter labextension install @lckr/jupyterlab_variableinspector
# Git integration
pip install jupyterlab-git
# Code formatting
pip install jupyterlab-code-formatter black isort
Multi-Environment Projects
Manage multiple environments for different stages:
name: project-dev
channels:
  - conda-forge
dependencies:
  - python=3.11  # Latest for development
  - numpy
  - pandas
  - jupyter
  - pytest
  - black
  - mypy
  - pip:
      - --index-url https://username:password@[org-id].mycloudrepo.io/repositories/[repo-id]/simple
      - -e ../company-ml-core  # Editable local install
name: project-prod
channels:
  - conda-forge
dependencies:
  - python=3.10.11  # Pinned for production
  - numpy=1.24.3
  - pandas=2.0.2
  - pip:
      - --index-url https://username:password@[org-id].mycloudrepo.io/repositories/[repo-id]/simple
      - company-ml-core==3.2.1  # Pinned version
#!/bin/bash

# Environment management script

create_env() {
    local env_name=$1
    local env_file=$2

    echo "Creating environment: $env_name from $env_file"
    mamba env create -f $env_file
    echo "✅ Environment $env_name created"
}

update_env() {
    local env_name=$1
    local env_file=$2

    echo "Updating environment: $env_name"
    mamba env update -n $env_name -f $env_file --prune
    echo "✅ Environment $env_name updated"
}

# Create all environments
create_all() {
    create_env project-dev environments/dev.yml
    create_env project-test environments/test.yml
    create_env project-prod environments/prod.yml
}

# Main script
case "$1" in
    create)
        create_env $2 $3
        ;;
    update)
        update_env $2 $3
        ;;
    create-all)
        create_all
        ;;
    *)
        echo "Usage: $0 {create|update|create-all} [env-name] [env-file]"
        exit 1
        ;;
esac
Future: Native Conda Channel Support
CloudRepo’s upcoming native Conda channel support will enable:
- Binary Package Hosting
Host compiled packages with C/C++ extensions, CUDA libraries, and platform-specific optimizations directly as conda packages.
- Custom Conda Channels
Create private conda channels alongside conda-forge and defaults:
Future: Native conda channel configuration

channels:
  - https://[org-id].mycloudrepo.io/conda/[channel-name]
  - conda-forge
  - defaults
- Conda Package Building
Build and publish conda packages directly:
Future: Publishing conda packages

# Build conda package
conda build recipe/

# Upload to CloudRepo conda channel
anaconda upload --channel cloudrepo package.tar.bz2
- Mixed Language Packages
Support packages that combine Python with R, Julia, or other languages in a single conda package.
This roadmap ensures CloudRepo remains the most cost-effective and feature-complete solution for teams using Conda in production.
CI/CD Integration
GitHub Actions with Conda
Integrate CloudRepo with GitHub Actions for automated testing and deployment:
name: Conda CI/CD Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
        python-version: ['3.9', '3.10', '3.11']

    steps:
      - uses: actions/checkout@v3

      - name: Setup Miniconda
        uses: conda-incubator/setup-miniconda@v2
        with:
          python-version: ${{ matrix.python-version }}
          miniforge-version: latest
          activate-environment: test-env
          environment-file: environment.yml
          auto-activate-base: false

      - name: Configure CloudRepo
        shell: bash -l {0}
        run: |
          pip config set global.index-url https://${{ secrets.CLOUDREPO_USER }}:${{ secrets.CLOUDREPO_PASS }}@[org-id].mycloudrepo.io/repositories/[repo-id]/simple
          pip config set global.extra-index-url https://pypi.org/simple

      - name: Install dependencies
        shell: bash -l {0}
        run: |
          conda info
          conda list
          pip install -e .[dev]

      - name: Run tests
        shell: bash -l {0}
        run: |
          pytest tests/ -v --cov=your_package --cov-report=xml

      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage.xml

  publish:
    needs: test
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'

    steps:
      - uses: actions/checkout@v3

      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Build package
        run: |
          pip install build
          python -m build

      - name: Publish to CloudRepo
        run: |
          pip install twine
          twine upload \
            --repository-url https://[org-id].mycloudrepo.io/repositories/[repo-id] \
            --username ${{ secrets.CLOUDREPO_USER }} \
            --password ${{ secrets.CLOUDREPO_PASS }} \
            dist/*
GitLab CI
Configure GitLab CI for Conda and CloudRepo:
image: continuumio/miniconda3:latest

variables:
  PIP_INDEX_URL: https://${CLOUDREPO_USER}:${CLOUDREPO_PASS}@[org-id].mycloudrepo.io/repositories/[repo-id]/simple
  PIP_EXTRA_INDEX_URL: https://pypi.org/simple

before_script:
  # conda activate needs conda's shell hooks in a non-login shell
  - source /opt/conda/etc/profile.d/conda.sh
  - conda env create -f environment.yml
  - conda activate ml-pipeline

stages:
  - test
  - build
  - deploy

test:unit:
  stage: test
  script:
    - conda activate ml-pipeline
    - pytest tests/unit -v --junitxml=report.xml
  artifacts:
    reports:
      junit: report.xml

test:integration:
  stage: test
  script:
    - conda activate ml-pipeline
    - pytest tests/integration -v

build:package:
  stage: build
  script:
    - conda activate ml-pipeline
    - python -m build
  artifacts:
    paths:
      - dist/
    expire_in: 1 week

deploy:cloudrepo:
  stage: deploy
  only:
    - main
    - tags
  script:
    - pip install twine
    - >-
      twine upload
      --repository-url https://[org-id].mycloudrepo.io/repositories/[repo-id]
      --username ${CLOUDREPO_USER}
      --password ${CLOUDREPO_PASS}
      dist/*
Azure ML Pipelines
Integrate with Azure Machine Learning pipelines:
trigger:
  branches:
    include:
      - main
      - develop

pool:
  vmImage: 'ubuntu-latest'

variables:
  - group: cloudrepo-credentials

stages:
- stage: Build
  jobs:
  - job: CondaBuild
    strategy:
      matrix:
        Python39:
          python.version: '3.9'
        Python310:
          python.version: '3.10'
        Python311:
          python.version: '3.11'

    steps:
    - task: UsePythonVersion@0
      inputs:
        versionSpec: '$(python.version)'

    - bash: |
        wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh
        bash miniconda.sh -b -p $HOME/miniconda
        export PATH="$HOME/miniconda/bin:$PATH"
        conda env create -f environment.yml
      displayName: 'Setup Conda Environment'

    - bash: |
        source $HOME/miniconda/bin/activate ml-pipeline
        pip config set global.index-url https://$(CLOUDREPO_USER):$(CLOUDREPO_PASS)@[org-id].mycloudrepo.io/repositories/[repo-id]/simple
        pip install -e .[dev]
      displayName: 'Install Dependencies'

    - bash: |
        source $HOME/miniconda/bin/activate ml-pipeline
        pytest tests/ --junitxml=junit/test-results.xml --cov --cov-report=xml
      displayName: 'Run Tests'

    - task: PublishTestResults@2
      inputs:
        testResultsFiles: '**/test-results.xml'

    - task: PublishCodeCoverageResults@1
      inputs:
        codeCoverageTool: 'Cobertura'
        summaryFileLocation: '$(System.DefaultWorkingDirectory)/coverage.xml'

- stage: Deploy
  condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
  jobs:
  - job: PublishPackage
    steps:
    - bash: |
        pip install build twine
        python -m build
        twine upload \
          --repository-url https://[org-id].mycloudrepo.io/repositories/[repo-id] \
          --username $(CLOUDREPO_USER) \
          --password $(CLOUDREPO_PASS) \
          dist/*
      displayName: 'Publish to CloudRepo'
Troubleshooting Common Issues
Authentication Issues
Problem: 401 Unauthorized errors when installing packages
# Check current pip configuration
pip config list
# Verify URL format (note the /simple suffix for PyPI)
# Correct: https://username:password@[org-id].mycloudrepo.io/repositories/[repo-id]/simple
# Wrong: https://username:password@[org-id].mycloudrepo.io/repositories/[repo-id]
# Test authentication
curl -u username:password https://[org-id].mycloudrepo.io/repositories/[repo-id]/simple
Problem: Credentials visible in conda environment export
# Use environment variables
export CLOUDREPO_INDEX=https://[org-id].mycloudrepo.io/repositories/[repo-id]/simple
# In environment.yml, reference without credentials
dependencies:
  - pip:
      - --index-url ${CLOUDREPO_INDEX}
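Before committing an exported environment file, it is worth checking that no credentials leaked into it. A small scrubber using only the standard library (the URL below is a made-up example):

```python
from urllib.parse import urlsplit, urlunsplit

def scrub_credentials(url):
    """Return url with any user:password@ stripped from the host part."""
    parts = urlsplit(url)
    host = parts.netloc.rpartition("@")[2]  # text after the last @, i.e. the bare host
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))

url = "https://user:s3cret@example.mycloudrepo.io/repositories/python-dev/simple"
clean = scrub_credentials(url)
print(clean)
# → https://example.mycloudrepo.io/repositories/python-dev/simple
```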
Package Resolution Conflicts
Problem: Conda and pip package versions conflict
# Best practice order
conda install numpy pandas scikit-learn
pip install your-private-package
# Never do this (pip first, then conda)
pip install numpy
conda install numpy # May downgrade or conflict
Problem: Dependency resolver timeout with complex requirements
# Install mamba for faster resolution
conda install -c conda-forge mamba
# Use mamba instead
mamba env create -f environment.yml
# Or simplify by pinning major versions only
dependencies:
  - numpy=1.24.*  # Instead of exact version
  - pandas=2.*
SSL/TLS Certificate Issues
Problem: SSL verification errors with corporate proxies
# Export corporate certificate
export REQUESTS_CA_BUNDLE=/path/to/corporate-ca-bundle.crt
export SSL_CERT_FILE=/path/to/corporate-ca-bundle.crt
# Or configure in pip
pip config set global.cert /path/to/corporate-ca-bundle.crt
# For conda
conda config --set ssl_verify /path/to/corporate-ca-bundle.crt
Environment Reproducibility
Problem: Different team members get different package versions
# Generate exact environment snapshot
conda list --explicit > conda-explicit.txt
pip freeze > requirements-lock.txt
# Recreate exact environment
conda create --name myenv --file conda-explicit.txt
conda activate myenv
pip install -r requirements-lock.txt
Problem: Builds fail in CI but work locally
# Export local environment
conda env export --no-builds > ci-environment.yml
# In CI, use exact same environment
conda env create -f ci-environment.yml
Performance Issues
Problem: Slow package downloads from CloudRepo
# Enable pip cache
pip config set global.cache-dir /path/to/cache
# Use parallel downloads (pip 20.3+)
pip install --use-feature=fast-deps your-package
# For conda, use mamba
mamba install your-packages
Problem: Large conda environments taking too much space
# Clean package cache
conda clean --all
# Use hard links to save space
conda config --set allow_softlinks false
# Pack environment for distribution
conda pack -n myenv -o myenv.tar.gz
Common Error Messages
ERROR: Could not find a version that satisfies the requirement
→ Check package name spelling and version availability
→ Verify repository URL includes /simple for PyPI
ERROR: No matching distribution found
→ Package might not exist in CloudRepo
→ Check if package supports your Python version/platform
CondaHTTPError: HTTP 401 UNAUTHORIZED
→ Invalid credentials or expired token
→ Create new repository user in CloudRepo admin
ResolvePackageNotFound
→ Package not available in specified channels
→ Add appropriate channel or install via pip
InvalidVersionSpec
→ Version specification syntax error
→ Use correct syntax: ==, >=, ~=, etc.
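In CI logs these messages are easy to triage automatically. A sketch that maps common failure signatures to the remediations above (the fragments and hints are condensed from this section, not an exhaustive list):

```python
# Map error-message fragments to the first remediation to try
ERROR_HINTS = {
    "Could not find a version that satisfies": "Check the package name/version and that the index URL ends in /simple",
    "No matching distribution found": "Confirm the package exists in CloudRepo for this Python version/platform",
    "HTTP 401": "Credentials invalid; create a new repository user in the CloudRepo Admin Portal",
    "ResolvePackageNotFound": "Add the missing conda channel or install the package via pip",
    "InvalidVersionSpec": "Fix the version specifier syntax (==, >=, ~=, ...)",
}

def triage(log_line):
    """Return a remediation hint for a pip/conda error line, if recognized."""
    for fragment, hint in ERROR_HINTS.items():
        if fragment in log_line:
            return hint
    return None

print(triage("CondaHTTPError: HTTP 401 UNAUTHORIZED"))
```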
Summary
CloudRepo provides robust support for Conda and Miniconda users through its PyPI repository functionality, enabling data science teams to:
Securely host and share proprietary Python packages within Conda environments
Combine the best of both worlds by using conda-forge for optimized binaries and CloudRepo for private packages
Maintain reproducible environments across development, testing, and production
Integrate seamlessly with existing CI/CD pipelines and MLOps workflows
Save significantly on infrastructure costs compared to enterprise alternatives
With native Conda channel support on the roadmap, CloudRepo is positioned to become the complete package management solution for data science and research teams, offering enterprise-grade features at a fraction of the cost of competitors.
For additional support or questions about using CloudRepo with Conda, please contact our support team at support@cloudrepo.io.
See also
Python Repositories - Core Python repository documentation
Poetry Repositories - Using Poetry with CloudRepo
Pixi Repositories - Pixi package management
UV Repositories - Ultra-fast UV package installer
Continuous Integration and Deployment - CI/CD setup guides