AWS Infrastructure Visualization with Cartography

Cartography is a Python-based infrastructure mining/visualization tool authored by Lyft.

Supported platforms: https://github.com/lyft/cartography#supported-platforms

This post covers the steps for running a local instance to analyze a single AWS account.

Configure AWS CLI Profile for Target Account

Create an AWS CLI profile for the target account. The IAM requirements are described in Single AWS Account Setup.
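As a quick sanity check before moving on, the sketch below looks for a profile name in the two files the AWS CLI reads by default. It assumes the standard locations (~/.aws/config and ~/.aws/credentials); the helper names `profiles_in` and `profile_exists` are hypothetical and not part of the AWS CLI or Cartography.

```python
import configparser
from pathlib import Path


def profiles_in(text: str, is_config_file: bool) -> set:
    """Return the profile names defined in AWS CLI config or credentials text."""
    parser = configparser.RawConfigParser(strict=False)
    parser.read_string(text)
    names = set()
    for section in parser.sections():
        if is_config_file:
            # ~/.aws/config uses "[profile NAME]", except for "[default]".
            if section == "default":
                names.add("default")
            elif section.startswith("profile "):
                names.add(section[len("profile "):].strip())
        else:
            # ~/.aws/credentials uses the bare profile name as the section.
            names.add(section)
    return names


def profile_exists(name: str, aws_dir: Path = None) -> bool:
    """Check both default AWS CLI files for the given profile name."""
    aws_dir = Path.home() / ".aws" if aws_dir is None else aws_dir
    found = set()
    for filename, is_config in (("config", True), ("credentials", False)):
        path = aws_dir / filename
        if path.is_file():
            found |= profiles_in(path.read_text(), is_config)
    return name in found
```

Calling `profile_exists("<your aws profile>")` with the real profile name should return True once the profile has been created.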

Cartography and Neo4j: Prepare Docker files

To get up and running, clone the official repository locally:

$ cd $HOME
$ git clone https://github.com/lyft/cartography.git
$ cd cartography

We’ll make some slight modifications to the existing Dockerfile and docker-compose.yml files.

These files are included below and are intended for analyzing a single AWS account. The Neo4j URL and the AWS profile are passed in as environment variables.

Dockerfile

$HOME/cartography/Dockerfile
FROM ubuntu:focal

ENV AWS_PROFILE=${AWS_PROFILE:-}
ENV NEO4J_URL=${NEO4J_URL:-}

# Create cartography user so that we can give it ownership of the directory later for unit&integ tests
RUN groupadd cartography && \
    useradd -s /bin/bash -d /home/cartography -m -g cartography cartography

ENV PATH=/home/cartography/.local/bin:/venv/bin:$PATH

RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends python3.8-dev python3-pip python3-setuptools openssl libssl-dev gcc pkg-config libffi-dev libxml2-dev libxmlsec1-dev curl make git && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

COPY . /home/cartography/srv/cartography

RUN chown -R cartography:cartography /home/cartography
RUN chmod -R a+w /home/cartography/srv/cartography

WORKDIR /home/cartography/srv/cartography

USER cartography

# Installs pip supported by python3.8
RUN curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && python3.8 get-pip.py

RUN pip install -U -e .

ENTRYPOINT echo $(whereis cartography) && AWS_PROFILE=${AWS_PROFILE} cartography -v --neo4j-uri ${NEO4J_URL}
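The ENTRYPOINT above amounts to reading two environment variables and passing one of them to the cartography CLI. Purely as an illustration, the same logic can be sketched in Python with an added check that both values are set. The function name `cartography_command` is hypothetical; the `-v` and `--neo4j-uri` flags are the ones used in the Dockerfile.

```python
import os


def cartography_command(env=None):
    """Build the argument list that the Dockerfile's ENTRYPOINT would run."""
    env = os.environ if env is None else env
    missing = [name for name in ("AWS_PROFILE", "NEO4J_URL") if not env.get(name)]
    if missing:
        raise ValueError("unset environment variables: " + ", ".join(missing))
    # AWS_PROFILE is not passed as a flag; it stays in the environment,
    # where the AWS SDK picks it up.
    return ["cartography", "-v", "--neo4j-uri", env["NEO4J_URL"]]
```

Failing early on an empty variable is a design choice of this sketch; the ENTRYPOINT itself performs no such check.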

Building the Docker image:

$ cd $HOME/cartography
$ docker build -t lyft/cartography .

docker-compose.yml

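Be sure to replace <your aws profile> with the name of the AWS profile created previously, and change the host side of the volume mount (/home/lts/.aws) to the path of your own ~/.aws directory.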

  - AWS_PROFILE=<your aws profile>
$HOME/cartography/docker-compose.yml
version: "3.7"
services:
  neo4j:
    image: neo4j:4.4.5-community
    container_name: neo4j
    restart: unless-stopped
    ports:
      - 7474:7474
      - 7687:7687
    volumes:
      - ./.compose/neo4j/conf:/conf
      - ./.compose/neo4j/data:/data
      - ./.compose/neo4j/import:/import
      - ./.compose/neo4j/logs:/logs
      - ./.compose/neo4j/plugins:/plugins
    environment:
      # Raise memory limits:
      - NEO4J_dbms_memory_pagecache_size=1G
      - NEO4J_dbms_memory_heap_initial__size=1G
      - NEO4J_dbms_memory_heap_max__size=1G
      # Auth:
      - NEO4J_AUTH=none
      # Add APOC and GDS:
      - NEO4J_apoc_export_file_enabled=true
      - NEO4J_apoc_import_file_enabled=true
      - NEO4J_apoc_import_file_use__neo4j__config=true
      - NEO4JLABS_PLUGINS=["graph-data-science", "apoc"]
      - NEO4J_dbms_security_procedures_allowlist=gds.*, apoc.*
      - NEO4J_dbms_security_procedures_unrestricted=gds.*, apoc.*
      # Networking:
      - NEO4J_dbms_connector_bolt_listen__address=0.0.0.0:7687
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:7474"]
      interval: 10s
      timeout: 10s
      retries: 10
  cartography:
    # As seen in docs, we build with `cd /path/to/cartography && docker build -t lyft/cartography .`
    # and then run with `docker-compose up -d`.
    image: lyft/cartography
    container_name: cartghy    
    user: cartography
    init: true
    restart: on-failure
    depends_on:
      - neo4j
    volumes:
      - /home/lts/.aws:/home/cartography/.aws
    environment:
      # Point to the neo4j service defined in this docker-compose file.
      - NEO4J_URL=bolt://neo4j:7687
      - AWS_PROFILE=<your aws profile>

Note: Since this is a local exploration environment, Neo4j authentication has been disabled by setting the following environment variable:

- NEO4J_AUTH=none
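The Neo4j Docker image maps environment variables to configuration settings by name: the variable carries a NEO4J_ prefix, each dot in the setting name becomes a single underscore, and each underscore becomes a double underscore. The small helpers below are a sketch for deriving or double-checking entries in the environment section; the function names are hypothetical.

```python
def neo4j_env_name(setting: str) -> str:
    """Convert a neo4j.conf setting name to its Docker environment variable."""
    # Escape literal underscores first so they are not confused with dots.
    return "NEO4J_" + setting.replace("_", "__").replace(".", "_")


def neo4j_setting_name(env_name: str) -> str:
    """Convert a NEO4J_* environment variable back to the setting name."""
    if not env_name.startswith("NEO4J_"):
        raise ValueError("not a NEO4J_ configuration variable: " + env_name)
    body = env_name[len("NEO4J_"):]
    # Protect double underscores, turn single ones into dots, then restore.
    return body.replace("__", "\0").replace("_", ".").replace("\0", "_")
```

For example, `neo4j_env_name("dbms.memory.heap.max_size")` yields the `NEO4J_dbms_memory_heap_max__size` entry used in the compose file. Variables such as NEO4J_AUTH and NEO4JLABS_PLUGINS are handled specially by the image and do not follow this mapping.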

Run Cartography/Neo4j Docker Compose Stack

  • Bring up the stack:
$ cd $HOME/cartography
$ docker compose up -d
  • Monitor the Cartography container logs:
$ docker logs --follow cartghy
  • Cartography will attempt to connect to Neo4j before it starts scanning the assets of the configured AWS account.
  • During initialization, you may see errors in the log output:
neo4j.exceptions.ServiceUnavailable: Failed to establish connection to ResolvedIPv4Address(('172.xx.x.x', 7687)) (reason [Errno 111] Connection refused)
  • These can be safely ignored; they occur because the Cartography service tries to connect to Neo4j while Neo4j is still initializing.
  • When Cartography starts scanning the target account, the following lines appear in the log output:
INFO:cartography.sync:Starting sync stage 'aws'
INFO:cartography.intel.aws:Syncing AWS accounts: nnnnnnnnnnnn
INFO:cartography.intel.aws:Syncing AWS account with ID 'nnnnnnnnnnnn' using configured profile 'your aws profile'.
  • These indicate that Cartography is scanning the assets of the AWS account (nnnnnnnnnnnn) using the associated profile ('your aws profile').
  • Scanning can take a considerable amount of time for accounts containing a large number of assets/resources.
  • To check whether the scan is complete, filter the log output for the following message, using the sample command shown below:
    • INFO:cartography.sync:Finishing sync stage 'aws'
$ docker logs cartghy 2>&1 | grep -F "INFO:cartography.sync:Finishing sync stage 'aws'"
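The log messages described in the steps above can also be turned into a simple status check. The following is a minimal sketch that classifies the state of the AWS sync from the container log text, letting the most recent marker win in case the container was restarted. The function names and status strings are hypothetical; the marker strings are the log lines quoted above.

```python
import subprocess

STARTED = "INFO:cartography.sync:Starting sync stage 'aws'"
FINISHED = "INFO:cartography.sync:Finishing sync stage 'aws'"
UNAVAILABLE = "neo4j.exceptions.ServiceUnavailable"


def aws_sync_status(log_text: str) -> str:
    """Return the sync state implied by the last marker found in the log."""
    status = "not started"
    for line in log_text.splitlines():
        if UNAVAILABLE in line:
            status = "waiting for neo4j"
        elif STARTED in line:
            status = "running"
        elif FINISHED in line:
            status = "finished"
    return status


def container_logs(name: str = "cartghy") -> str:
    """Fetch a container's logs; docker replays both stdout and stderr."""
    result = subprocess.run(
        ["docker", "logs", name], capture_output=True, text=True, check=True
    )
    return result.stdout + result.stderr
```

With the stack running, `aws_sync_status(container_logs())` reports "finished" once the completion message has been logged.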

Using Neo4j to Query the AWS Account Scan Results

The local Neo4j web interface can be accessed at http://localhost:7474. As previously mentioned, for our local discovery exercise, Neo4j authentication has been disabled.
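If the page does not load, it can help to confirm that the published ports are accepting connections at all. The sketch below polls a TCP port until it opens or a timeout expires; the function name `wait_for_port` and the default timings are assumptions chosen for illustration. An open port only shows that something is listening, not that Neo4j has finished starting.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out; retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For this stack, `wait_for_port("localhost", 7474)` checks the web interface and `wait_for_port("localhost", 7687)` checks the Bolt port.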

From the top left-hand corner, click the database icon to display the list of available databases.

The neo4j database should appear in the left-hand pane, along with the node labels for our target AWS account.

You can start exploring by clicking any of the node labels and viewing the output in graph, table or text format.

Running Queries

You can also query account assets directly with Cypher, Neo4j's query language.

For example, the sample query below returns the name and ARN of up to 100 IAM roles in the account:

MATCH (n:AWSRole) RETURN n.name, n.arn LIMIT 100
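The same query can be run outside the browser. The sketch below posts it to the transactional HTTP endpoint that Neo4j 4.x exposes on port 7474, using only the Python standard library. It assumes the local stack from this post (authentication disabled, default database named neo4j); the helper names are hypothetical.

```python
import json
import urllib.request

QUERY = "MATCH (n:AWSRole) RETURN n.name, n.arn LIMIT 100"


def build_request(query, base_url="http://localhost:7474", database="neo4j"):
    """Prepare a POST to the HTTP API's single-request commit endpoint."""
    body = json.dumps({"statements": [{"statement": query}]}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/db/{database}/tx/commit",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def rows_from_response(payload):
    """Turn the API's columns/rows structure into a list of dicts."""
    if payload.get("errors"):
        raise RuntimeError(payload["errors"])
    result = payload["results"][0]
    return [dict(zip(result["columns"], item["row"])) for item in result["data"]]


def run_query(query):
    """Send the query to the local Neo4j instance and return its rows."""
    with urllib.request.urlopen(build_request(query), timeout=30) as response:
        return rows_from_response(json.load(response))
```

Calling `run_query(QUERY)` against the running stack returns one dictionary per role, keyed by `n.name` and `n.arn`.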