Docs
Data Pipelines
Integrations
AWS Raw Pipeline

AWS Raw Pipeline

To set up a raw export pipeline to an S3 bucket from Mixpanel, you must configure S3 to receive the exported data, then create a pipeline (opens in a new tab) to export the data.

The following document summarizes the steps to edit S3 bucket permissions so that it accepts the Mixpanel export. Consult AWS documentation (opens in a new tab) for any AWS specific tasks, such as creating an S3 bucket (opens in a new tab) and editing permissions (opens in a new tab).

To prepare S3 for the incoming data you must:

  1. Create an S3 bucket.
  2. Give Mixpanel the required permissions to write to the bucket.

S3 Bucket Permissions

Mixpanel supports a wide range of configurations to secure and manage your data on S3. To access resources, the pipeline uses AWS cross-account roles.

This section highlights the permissions you must give Mixpanel depending on the configuration of the target S3 bucket.

Data Modification Policy

All exports from Mixpanel to AWS require that you create a new data modification policy, or add the following permissions to an existing data modification policy.

Replacing <BUCKET_NAME> and <SID> with your bucket name and chosen sid before inserting this JSON:

{
    "Version": "VERSION",
    "Statement": [
        {
            "Sid": "<SID>",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:ListBucket",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::<BUCKET-NAME>",
                "arn:aws:s3:::<BUCKET-NAME>/*"
            ]
        }
    ]
}

Server-Side Encryption

Mixpanel always sends data to your S3 bucket on a TLS encrypted connection. To secure your data at rest on S3, you can enable Server-Side Encryption (SSE) (opens in a new tab).

There are two options when using SSE: Encryption with Amazon S3-Managed Keys (SSE-S3) and Encryption with AWS KMS-Managed Keys (SSE-KMS)

Encryption with Amazon S3-Managed Keys (SSE-S3)

This setting on your bucket encrypts data at rest using the AES-256 algorithm that uses keys managed by S3.

If you are using this type of SSE, you only need to configure your pipeline by passing the s3_encryption=aes parameter when calling the Mixpanel Data Warehouse Export API. See AWS S3 and Glue Parameters (opens in a new tab).

Encryption with AWS KMS-Managed Keys (SSE-KMS)

For S3 buckets, you can pick a default key named aws/s3. If you opt to use the default key you need to configure your pipeline by passing s3_encryption=kms and s3_kms_key_id=<aws managed default CMK> when calling the Mixpanel Data Warehouse Export API or you can choose to use your own custom keys for encrypting the contents of your bucket

In either case you will need to allow Mixpanel to use the key to encrypt the data properly as it is written to your bucket.

To achieve this, create an IAM policy that gives permission to Mixpanel to use the KMS key. Use the following JSON snippet and replace <KEY_ARN> and <SID> with your custom key's (or aws managed default CMK's) ARN and chosen sid:

{
    "Version": "VALUE",
    "Statement": [
        {
            "Sid": "<SID>",
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt",
                "kms:Encrypt",
                "kms:GenerateDataKey",
                "kms:ReEncryptTo",
                "kms:GenerateDataKeyWithoutPlaintext",
                "kms:DescribeKey",
                "kms:ReEncryptFrom"
            ],
            "Resource": "<KEY-ARN>"
        }
    ]
}

You must configure your pipeline by passing s3_encryption=kms and s3_kms_key_id=<KEY_ARN> when calling the Mixpanel Data Warehouse Export API.

S3 Access Role

After creating the policies in the sections above, you must create a cross account IAM Role to assign the policies to the role.

  • Go to the _AWS IAM _service on the console.
  • Click Roles in the sidebar.
  • Click Create Role.
  • Select **Other AWS Accounts **on the trust policy page and enter "485438090326" for the account ID.
  • In the Permissions page, find and select the policies you created above.
  • In the _Review _page, enter a name and description for the role and click Save.

Next, limit the trust relationship to the Mixpanel export user to ensure only Mixpanel has the ability to assume this specific role.

  • Navigate to the AWS IAM service in the console.
  • Click Roles in the sidebar.
  • Find and click the role you just created.
  • Navigate to the Trust Relationships tab.
  • Click Edit trust relationship.
  • Replace the contents with the following JSON:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::485438090326:user/mixpanel-export"
      },
      "Action": "sts:AssumeRole",
      "Condition": {}
    }
  ]
}

Using AWS External ID

Amazon introduced the use of external id for cross-account access because of the confused deputy problem (opens in a new tab). As Mixpanel uses cross account access to export data, you can make use of this feature to make the data transfer more secure.

Mixpanel uses your project token as external ID when talking to AWS. In order to enable this, you simply need to edit the trust relationship you created as part of the previous step and add a condition to check the passed external id is in fact your Mixpanel project token. So, the final JSON for your trust relationship would be:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::485438090326:user/mixpanel-export"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "<MIXPANEL_PROJECT_TOKEN>"
        }
      }
    }
  ]
}

Use The Data Pipelines API

After permissions have been granted, use the Data Pipelines API (opens in a new tab) to create the pipeline. Here is an example request:

curl https://data.mixpanel.com/api/2.0/nessie/pipeline/create \
-u API_SECRET: \
-d type="s3-raw" \
-d from_date="2019-08-10" \
-d s3_bucket="example-s3-export" \
-d s3_region="us-west-2" \
-d s3_prefix="test" \
-d s3_role="arn:aws:iam::<account-id>:role/example-s3-role" \
-d s3_encryption="aes" \

Was this page useful?