Performance Tuning

Optimize CloudRepo performance for your specific use cases.

Overview

While CloudRepo is optimized out-of-the-box, these tuning tips can further improve performance for large-scale deployments and specific scenarios.

Network Optimization

Connection Pooling

Configure your clients to reuse connections:

Maven:

<settings>
  <servers>
    <server>
      <id>cloudrepo</id>
      <configuration>
        <httpConfiguration>
          <all>
            <connectionTimeout>60000</connectionTimeout>
            <requestTimeout>60000</requestTimeout>
            <httpClient>
              <maxConnections>20</maxConnections>
            </httpClient>
          </all>
        </httpConfiguration>
      </configuration>
    </server>
  </servers>
</settings>

Python:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
adapter = HTTPAdapter(
    pool_connections=10,
    pool_maxsize=20,
    max_retries=Retry(total=3, backoff_factor=0.3)
)
session.mount('https://', adapter)

Parallel Operations

Parallel Builds (Maven):

mvn -T 4 clean install  # Build independent modules on 4 threads; their dependency downloads overlap

Parallel Uploads (Python):

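from concurrent.futures import ThreadPoolExecutor, as_completed
import os
import requests

# Placeholders: replace with your repository URL, credentials, and files
base_url = "https://your-org.cloudrepo.io/repository/raw"
auth = ("username", "password")
files = ["a.jar", "b.jar"]

def upload_file(file_path, base_url, auth):
    # Each file needs its own target URL
    url = f"{base_url}/{os.path.basename(file_path)}"
    with open(file_path, 'rb') as f:
        response = requests.put(url, data=f, auth=auth, timeout=120)
    response.raise_for_status()
    return url

with ThreadPoolExecutor(max_workers=5) as executor:
    futures = [executor.submit(upload_file, f, base_url, auth) for f in files]
    for future in as_completed(futures):
        print("uploaded", future.result())  # re-raises any upload error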

Client-Side Caching

Local Repository Cache

Maven: Configure local repository:

<settings>
  <localRepository>/path/to/fast/ssd/repository</localRepository>
</settings>

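Gradle: Redirect build output to fast storage, for example from an init script (the dependency cache is relocated separately by setting GRADLE_USER_HOME):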

gradle.projectsLoaded {
    rootProject.allprojects {
        buildDir = "/path/to/fast/ssd/build/${rootProject.name}/${project.name}"
    }
}

pip: Configure cache directory:

export PIP_CACHE_DIR=/path/to/fast/ssd/pip-cache

Proxy Repository Optimization

Cache Configuration

For proxy repositories caching external dependencies:

  1. Set appropriate TTL (Time To Live):

     • Releases: 1440 minutes (24 hours)

     • Snapshots: 10 minutes

     • Metadata: 30 minutes

  2. Pre-fetch critical dependencies:

# Pre-populate proxy cache
mvn dependency:go-offline

  3. Use repository groups so that local repositories are checked before the proxy
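
To make the TTL values in step 1 concrete, here is a minimal sketch of how a cache could decide whether an entry needs to be re-validated upstream. It is not CloudRepo's implementation; the `TTL_MINUTES` table and the `is_stale` helper are hypothetical names that only mirror the values listed above.

```python
from datetime import datetime, timedelta, timezone

# TTLs from the list above, in minutes. The keys are illustrative labels.
TTL_MINUTES = {"release": 1440, "snapshot": 10, "metadata": 30}

def is_stale(kind: str, cached_at: datetime, now: datetime) -> bool:
    """Return True when a cached entry of the given kind has outlived its TTL."""
    return now - cached_at > timedelta(minutes=TTL_MINUTES[kind])

now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fetched = now - timedelta(minutes=20)  # entry cached 20 minutes ago

print(is_stale("snapshot", fetched, now))  # True
print(is_stale("metadata", fetched, now))  # False
print(is_stale("release", fetched, now))   # False
```

Short snapshot and metadata TTLs keep fast-moving artifacts fresh, while the long release TTL reflects that released versions are not expected to change.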

CDN Utilization

Geographic Optimization

CloudRepo automatically routes requests to the nearest CDN edge. To benefit from this:

  1. Make sure DNS resolution uses CloudRepo’s DNS

  2. Avoid proxies that might route traffic incorrectly

  3. Check latency to different regions:

# Test latency
ping your-org.cloudrepo.io
traceroute your-org.cloudrepo.io

Build Tool Configuration

Maven Optimization

# These are JVM system properties, not settings.xml elements.
# Set them in MAVEN_OPTS or in .mvn/jvm.config.
# The maven.wagon.* properties apply only when the Wagon HTTP transport is in use.
export MAVEN_OPTS="-Dmaven.artifact.threads=10 \
  -Dmaven.wagon.http.pool=true \
  -Dmaven.wagon.httpconnectionManager.ttlSeconds=120"

Gradle Optimization

# gradle.properties
org.gradle.parallel=true
org.gradle.caching=true
org.gradle.daemon=true
org.gradle.configureondemand=true
org.gradle.workers.max=4

Python/pip Optimization

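# Keep pip current; newer releases include resolver and download improvements
pip install --upgrade pip
# trusted-host skips TLS certificate verification for this host; set it only if your network requires it
pip config set global.trusted-host your-org.cloudrepo.io
# Allow slow connections up to 120 seconds before timing out
pip config set global.timeout 120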

Upload Performance

Batch Operations

Instead of individual uploads, batch them:

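import tarfile
import requests

# files, upload_url, and auth are placeholders supplied by your script

# Create archive of multiple files
with tarfile.open('batch.tar.gz', 'w:gz') as tar:
    for file in files:
        tar.add(file)

# Single upload of the whole archive
with open('batch.tar.gz', 'rb') as f:
    response = requests.put(upload_url, data=f, auth=auth)
response.raise_for_status()  # fail loudly if the upload was rejected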

Compression

Enable compression for text-based artifacts:

# Compress before upload (-k keeps the original file)
gzip -9 -k large-file.xml
# The .gz file is the artifact itself, so no Content-Encoding header is sent
curl -u username:password \
     --upload-file large-file.xml.gz \
     https://your-org.cloudrepo.io/repository/raw/large-file.xml.gz
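
Whether compression pays off depends on the content. The following sketch measures the gzip ratio of a sample before you commit to compressing a whole class of artifacts. The `gzip_ratio` helper and the sample data are hypothetical and only illustrate the difference between text-like and already-compressed input.

```python
import gzip
import os

def gzip_ratio(data: bytes, level: int = 9) -> float:
    """Compressed size divided by original size (lower means better compression)."""
    if not data:
        return 1.0
    return len(gzip.compress(data, compresslevel=level)) / len(data)

# Repetitive XML-like text, similar to POMs or metadata files
xml_like = b"<dependency><groupId>com.example</groupId></dependency>\n" * 2000
# Random bytes stand in for data that is already compressed (JARs, ZIPs, images)
random_like = os.urandom(len(xml_like))

print(f"text-like ratio:   {gzip_ratio(xml_like):.3f}")
print(f"random-like ratio: {gzip_ratio(random_like):.3f}")
```

A ratio close to 1.0 means compression only costs CPU time, which is why the advice above is limited to text-based artifacts.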

Download Performance

Range Requests

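For large files, stream the response in chunks so that the whole file is never held in memory. (Fetching only part of a file, or resuming an interrupted download, additionally requires a Range header.)

import requests

def download_in_chunks(url, auth, dest_path, chunk_size=1024 * 1024):
    # stream=True defers the body download until iter_content is consumed
    with requests.get(url, auth=auth, stream=True, timeout=120) as response:
        response.raise_for_status()
        with open(dest_path, 'wb') as f:
            for chunk in response.iter_content(chunk_size=chunk_size):
                f.write(chunk)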
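
A range request itself comes down to sending a `Range` header. As a minimal sketch, the helpers below build the header values for splitting a file into parts and for resuming after a partial download. `byte_ranges` and `resume_header` are hypothetical names, and the sketch assumes the server answers range requests with `206 Partial Content`, which you should verify against your repository.

```python
from typing import Dict, List

def byte_ranges(total_size: int, part_size: int) -> List[str]:
    """Split [0, total_size) into Range header values of at most part_size bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    ranges = []
    for start in range(0, total_size, part_size):
        end = min(start + part_size, total_size) - 1  # Range ends are inclusive
        ranges.append(f"bytes={start}-{end}")
    return ranges

def resume_header(bytes_already_written: int) -> Dict[str, str]:
    """Header asking the server for everything after what is already on disk."""
    return {"Range": f"bytes={bytes_already_written}-"}

print(byte_ranges(10, 4))   # ['bytes=0-3', 'bytes=4-7', 'bytes=8-9']
print(resume_header(4096))  # {'Range': 'bytes=4096-'}
```

With `requests`, the header is passed as `headers=resume_header(n)`. A `200` response instead of `206` means the server ignored the range and sent the whole file, so the local partial file should be overwritten rather than appended to.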

Concurrent Downloads

# Using aria2 for parallel downloads
aria2c -x 5 -s 5 \
       --http-user=username \
       --http-passwd=password \
       https://your-org.cloudrepo.io/repository/raw/large-file.zip

Repository Organization

Optimize Structure

  1. Separate by stability: releases vs snapshots

  2. Separate by team to keep each repository small

  3. Prefer purpose-specific repositories over a single catch-all

  4. Clean up old snapshots regularly

Retention Policies

Configure automatic cleanup:

  • Keep last 10 snapshot versions

  • Remove snapshots older than 30 days

  • Archive old releases to cold storage
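
The first two rules can be combined into a dry-run selection like the sketch below. It only computes which versions would be removed and does not call any CloudRepo API. The function name, the `(version, uploaded_at)` tuple shape, and the choice to remove a snapshot when either rule matches are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

def snapshots_to_delete(snapshots, now, keep_last=10, max_age_days=30):
    """Return the versions to remove from a list of (version, uploaded_at) tuples.

    Assumption: a snapshot is removed if it falls outside the newest
    `keep_last` entries OR is older than `max_age_days`.
    """
    newest_first = sorted(snapshots, key=lambda s: s[1], reverse=True)
    cutoff = now - timedelta(days=max_age_days)
    doomed = []
    for index, (version, uploaded_at) in enumerate(newest_first):
        if index >= keep_last or uploaded_at < cutoff:
            doomed.append(version)
    return doomed

now = datetime(2024, 6, 1, tzinfo=timezone.utc)
# Twelve snapshots, uploaded 0, 4, 8, ... 44 days ago
snaps = [(f"1.0-{i}", now - timedelta(days=i * 4)) for i in range(12)]

print(snapshots_to_delete(snaps, now))
# ['1.0-8', '1.0-9', '1.0-10', '1.0-11']
```

Running the selection as a dry run first makes it easy to review the list before any cleanup is applied.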

Monitoring Performance

Measure Metrics

Track key performance indicators:

import time
import requests

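def measure_performance(url, auth):
    start = time.perf_counter()  # monotonic clock, unaffected by system time changes
    response = requests.get(url, auth=auth)
    duration = time.perf_counter() - start
    response.raise_for_status()  # do not time error pages as if they were downloads
    size = len(response.content)

    return {
        'duration': duration,  # seconds
        'size': size,  # bytes
        'speed': size / duration / 1024 / 1024  # MiB/s
    }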
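
A single measurement is noisy, so it usually helps to summarize several runs. The sketch below aggregates a list of durations into a median and a nearest-rank 95th percentile. The `summarize` helper and the sample values are hypothetical.

```python
import statistics

def summarize(durations):
    """Summarize request durations (in seconds) from repeated measurements."""
    if not durations:
        raise ValueError("at least one duration is required")
    ordered = sorted(durations)
    # Nearest-rank p95: ceil(0.95 * n), computed with integers to avoid float rounding
    rank = (95 * len(ordered) + 99) // 100
    return {
        "count": len(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }

samples = [0.21, 0.19, 0.25, 0.22, 1.40, 0.20, 0.23, 0.18, 0.24, 0.22]
print(summarize(samples))
```

Comparing the median with the 95th percentile separates typical latency from occasional slow requests, which tend to have different causes.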

CloudRepo Metrics

Monitor via CloudRepo dashboard:

  • Repository access patterns

  • Peak usage times

  • Geographic distribution

  • Large file transfers

CI/CD Optimization

Pipeline Caching

GitHub Actions:

- uses: actions/cache@v4
  with:
    path: ~/.m2/repository
    key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}

GitLab CI:

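variables:
  # GitLab only caches paths inside the project directory
  MAVEN_OPTS: "-Dmaven.repo.local=$CI_PROJECT_DIR/.m2/repository"

cache:
  paths:
    - .m2/repository
  key: "$CI_COMMIT_REF_SLUG"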

Docker Layer Caching

# Multi-stage build with dependency caching
FROM maven:3.8-openjdk-11 AS dependencies
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline

FROM maven:3.8-openjdk-11 AS build
WORKDIR /app
COPY --from=dependencies /root/.m2 /root/.m2
COPY . .
RUN mvn package

Troubleshooting Performance

Common Issues

Slow uploads:

  • Check network bandwidth

  • Use compression

  • Batch small files

Slow downloads:

  • Verify CDN is being used

  • Check for network proxies

  • Use local caching

High latency:

  • Check geographic routing

  • Verify DNS resolution

  • Consider dedicated infrastructure

Performance Testing

# Test upload speed
time curl -u username:password \
     --upload-file test-100mb.file \
     https://your-org.cloudrepo.io/repository/raw/test.file

# Test download speed
time curl -u username:password \
     -o /dev/null \
     https://your-org.cloudrepo.io/repository/raw/test.file

Best Practices Summary

  1. Use connection pooling and keep-alive

  2. Enable parallel operations where possible

  3. Implement local caching strategically

  4. Optimize repository structure and retention

  5. Monitor performance metrics regularly

  6. Use CDN effectively via proper DNS

  7. Batch operations when possible

  8. Compress large text files

  9. Configure appropriate timeouts

  10. Cache in CI/CD pipelines

Getting Help

For performance issues:

Next Steps