Installation¶
Prerequisites¶
- uv - Fast Python package manager
- Java 17+ (for PySpark)
  - You may need to set the JAVA_HOME environment variable (e.g. export JAVA_HOME="/Users/$(whoami)/Library/Java/JavaVirtualMachines/corretto-17.0.9/Contents/Home")
- Node.js 22+ (for CDK)
- AWS CLI configured with appropriate credentials (for deployment)
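Before installing, it can help to confirm that the prerequisite commands are on PATH. The following is a minimal sketch, not part of the project: missing_tools is a hypothetical helper, and it only checks that each command exists, not that its version meets the minimums listed above.

```shell
# Print the name of every command in "$@" that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Command names assumed from the prerequisites list: uv, Java, Node.js, AWS CLI.
missing=$(missing_tools uv java node aws)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing prerequisites:" $missing
else
  echo "All prerequisite commands found."
fi
```

Version checks (Java 17+, Node.js 22+) still need to be done separately, for example with java -version and node --version.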
Installing uv¶
# macOS / Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Or with Homebrew
brew install uv
# Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
Quick Setup¶
The fastest way to get started:
# Clone the repository
git clone https://github.com/caverac/insurance-fraud-detection.git
cd insurance-fraud-detection
# Install everything (Python + Node.js + pre-commit hooks)
make install
This single command:
- Creates a virtual environment and installs all Python dependencies via uv sync
- Installs pre-commit hooks for code quality
- Installs Node.js dependencies for CDK in packages/infra
Manual Installation¶
If you prefer to install components individually:
1. Python Dependencies¶
2. Pre-commit Hooks¶
3. Node.js Dependencies (for CDK)¶
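The three steps above can be sketched as a small script. This is a sketch under assumptions rather than the project's actual Makefile recipe: uv sync is the command named under Quick Setup, while uv run pre-commit install and yarn install are assumed from the tools used elsewhere in this guide. The function names are hypothetical.

```shell
# Hypothetical helpers; run them from the repository root.

install_python_deps() {
  # Creates .venv if needed and installs the locked Python dependencies.
  uv sync
}

install_hooks() {
  # Assumption: hooks are registered with pre-commit's standard install command.
  uv run pre-commit install
}

install_node_deps() {
  # Assumption: the CDK project uses yarn, as the CDK commands in this guide do.
  # The subshell keeps the caller's working directory unchanged.
  (cd packages/infra && yarn install)
}

manual_install() {
  # Stop at the first step that fails.
  install_python_deps && install_hooks && install_node_deps
}
```

After sourcing the script, call manual_install from the repository root, or call the individual functions to run a single step.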
Verify Installation¶
# Check fraud detection CLI
uv run fraud-detect --help
# Check Spark
uv run python -c "from pyspark.sql import SparkSession; print('PySpark OK')"
# Check CDK (the subshell keeps the following commands in the project root)
(cd packages/infra && yarn cdk --version)
# Verify pre-commit hooks
uv run pre-commit run --all-files
# Run tests
uv run pytest
Common uv Commands¶
# Sync dependencies (install/update)
uv sync
# Run a command in the virtual environment
uv run <command>
# Add a new dependency
uv add <package>
# Add a dev dependency
uv add --dev <package>
# Update dependencies
uv lock --upgrade
uv sync
CDK Commands¶
CDK commands are run from the packages/infra directory:
cd packages/infra
# Synthesize CloudFormation templates
yarn synth
# Deploy all stacks
yarn deploy
# Show differences
yarn diff
# Bootstrap (first-time setup)
yarn bootstrap
Equivalent make targets can also be run from the project root.
IDE Configuration¶
VS Code¶
Recommended extensions:
- Python
- Pylance
- Black Formatter
- Pylint
- AWS Toolkit
Settings (.vscode/settings.json):
{
  "python.defaultInterpreterPath": ".venv/bin/python",
  "python.analysis.typeCheckingMode": "basic",
  "[python]": {
    "editor.defaultFormatter": "ms-python.black-formatter",
    "editor.formatOnSave": true
  },
  "editor.codeActionsOnSave": {
    "source.organizeImports": "explicit"
  }
}
PyCharm¶
- Set Python interpreter to .venv/bin/python
- Mark packages/*/src as Sources Root
- Enable Black as external formatter
- Enable Pylint inspections
Troubleshooting¶
PySpark Issues¶
If you encounter Java-related errors, install a Java 17 JDK and point JAVA_HOME at it:
# On macOS with Homebrew
brew install openjdk@17
export JAVA_HOME="$(brew --prefix openjdk@17)"  # e.g. /opt/homebrew/opt/openjdk@17 on Apple Silicon
# On Ubuntu
sudo apt install openjdk-17-jdk
export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
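To confirm the change took effect, it can help to check which Java major version JAVA_HOME actually resolves to. The following is a minimal sketch: java_major is a hypothetical helper, not part of the project, and it assumes the usual java -version banner format with a quoted version string.

```shell
# Print the major Java version found in `java -version` banner text.
# Handles modern ("17.0.9") and legacy ("1.8.0_292") version strings.
java_major() {
  # Pull out the quoted version string, e.g. 17.0.9
  version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$version" in
    1.*) version=${version#1.} ;;  # legacy scheme: 1.8.0 means Java 8
  esac
  # Keep only the part before the first '.', '_' or '-'.
  printf '%s\n' "${version%%[._-]*}"
}

java_bin="${JAVA_HOME:-}/bin/java"
if [ -x "$java_bin" ]; then
  # java -version writes its banner to stderr, hence the 2>&1.
  java_major "$("$java_bin" -version 2>&1)"
else
  echo "JAVA_HOME is unset or does not contain bin/java" >&2
fi
```

PySpark in this project needs the printed number to be 17 or higher, per the prerequisites.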
CDK Bootstrap¶
First-time CDK deployment requires bootstrapping. Run yarn bootstrap from the packages/infra directory (see CDK Commands).
Permission Issues¶
Ensure your AWS credentials have sufficient permissions for:
- S3 bucket operations
- EMR cluster management
- Glue catalog access
- IAM role creation
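Before reviewing individual permissions, it is often worth confirming which identity the AWS CLI is actually using. The following is a minimal sketch, assuming the AWS CLI is installed and configured; aws_identity is a hypothetical helper name.

```shell
# Hypothetical helper: print the ARN of the identity the AWS CLI resolves to.
aws_identity() {
  aws sts get-caller-identity --query Arn --output text
}
```

If the printed ARN is not the user or role you expected, check AWS_PROFILE and your configured credentials before looking at the policies attached to it.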
Pre-commit Hook Failures¶
If pre-commit hooks fail, you can run them manually to see details:
# Run all hooks
uv run pre-commit run --all-files
# Run specific hook
uv run pre-commit run black --all-files
uv run pre-commit run pylint --all-files