Developer · February 2025 · 12 min read

Akeneo REST API Tutorial: Fetching and Exporting Products (2025)

A complete walkthrough of the Akeneo REST API from authentication to full catalog export — including product models, families, attributes, and pagination for large catalogs.

Contents

  1. Authentication overview
  2. Key API endpoints
  3. Fetching products with pagination
  4. Understanding the product data structure
  5. Fetching families and attributes
  6. Handling product models (variants)
  7. Categories and reference entities
  8. Exporting a full catalog to JSON
  9. When to use the API vs a connector

1. Authentication overview

Akeneo uses the OAuth2 password grant. You POST your credentials once to get a bearer token, then send it in the Authorization header of every request. Tokens expire after 1 hour.

import requests
from base64 import b64encode

BASE_URL = "https://your-akeneo.com"
CLIENT_ID = "2_abc123"
CLIENT_SECRET = "your_secret"
USERNAME = "api_user"
PASSWORD = "your_password"

def get_token():
    # The client ID and secret are sent as HTTP Basic credentials
    creds = b64encode(f"{CLIENT_ID}:{CLIENT_SECRET}".encode()).decode()
    r = requests.post(
        f"{BASE_URL}/api/oauth/v1/token",
        headers={"Content-Type": "application/json",
                 "Authorization": f"Basic {creds}"},
        json={"username": USERNAME, "password": PASSWORD,
              "grant_type": "password"}
    )
    r.raise_for_status()  # fail loudly on bad credentials instead of a KeyError below
    return r.json()["access_token"]

TOKEN = get_token()

Full authentication guide: Akeneo API Authentication
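
Because tokens expire after an hour, a long-running script needs some way to request a fresh one. The following is a minimal sketch, not part of the Akeneo client itself: `TokenCache` is a hypothetical helper that wraps any token-fetching function (such as `get_token` above) and calls it again shortly before the assumed one-hour lifetime runs out. The `margin_seconds` and `clock` parameters are illustrative choices.

```python
import time


class TokenCache:
    """Caches an access token and re-fetches it shortly before it expires."""

    def __init__(self, fetch_token, lifetime_seconds=3600, margin_seconds=60,
                 clock=time.monotonic):
        self._fetch_token = fetch_token    # e.g. the get_token function above
        self._lifetime = lifetime_seconds  # assumed 1-hour lifetime, per the text
        self._margin = margin_seconds      # refresh a little early to be safe
        self._clock = clock                # injectable so the logic can be tested
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return a cached token, fetching a new one if it is missing or stale."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            self._token = self._fetch_token()
            self._expires_at = now + self._lifetime
        return self._token
```

With this in place, a script would create `tokens = TokenCache(get_token)` once and build its headers from `tokens.get()` before each request instead of reusing a single `TOKEN` value.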

2. Key API endpoints

All endpoints use the base path https://your-akeneo.com/api/rest/v1/.

Method  Endpoint                                        Returns
GET     /api/rest/v1/products                           List all simple products (paginated). Includes attributes but NOT parent attributes.
GET     /api/rest/v1/products/{code}                    Get a single product by identifier/SKU.
GET     /api/rest/v1/product-models                     List all product models (root + sub models). Needed to resolve parent attributes.
GET     /api/rest/v1/product-models/{code}              Get a single product model by code.
GET     /api/rest/v1/families                           List all product families with their attributes and labels.
GET     /api/rest/v1/families/{code}/variants           List family variants (defines which attributes live at which level).
GET     /api/rest/v1/attributes                         List all attribute definitions (type, labels, options for select attributes).
GET     /api/rest/v1/attributes/{code}/options          List options for select/multiselect attributes.
GET     /api/rest/v1/categories                         List all categories with their tree structure.
GET     /api/rest/v1/reference-entities/{code}/records  List records for a reference entity (e.g., color, brand, material).

3. Fetching products with pagination

The products endpoint returns paginated results. Use limit=100 (max allowed) and follow the _links.next cursor to get all pages:

def fetch_all_products(token):
    """
    Fetch every product using search_after (cursor) pagination, which stays
    efficient on large catalogs where page-number pagination slows down.
    """
    headers = {"Authorization": f"Bearer {token}"}
    params = {"limit": 100, "with_count": "true", "pagination_type": "search_after"}

    url = f"{BASE_URL}/api/rest/v1/products"
    all_products = []
    total = None

    while url:
        r = requests.get(url, headers=headers, params=params)
        r.raise_for_status()
        data = r.json()

        if total is None:
            # items_count may be missing from the response, so fall back to "?"
            total = data.get("items_count", "?")
            print(f"Total products: {total}")

        items = data.get("_embedded", {}).get("items", [])
        all_products.extend(items)
        print(f"  Fetched {len(all_products)}/{total}")

        # Follow the next-page link; the loop ends when there is none
        url = data.get("_links", {}).get("next", {}).get("href")
        params = {}  # Params are included in the next URL

    return all_products

products = fetch_all_products(TOKEN)

Tip: Use pagination_type=search_after instead of page numbers for catalogs over 1000 products. Page-number pagination gets slower the deeper you page, while search_after uses a cursor, so the cost of each page stays roughly constant.

4. Understanding the product data structure

Each product returned by the API has this structure — attributes are nested under values, with locale and scope variants:

{
  "identifier": "TSHIRT-BLUE-M",
  "family": "clothing",
  "parent": "TSHIRT-BLUE",  // null for standalone products
  "categories": ["mens", "shirts"],
  "enabled": true,
  "created": "2024-01-15T10:23:00+00:00",
  "updated": "2025-02-10T14:35:22+00:00",
  "values": {
    // Localizable + scopable attribute:
    "description": [
      {
        "locale": "en_US",
        "scope": "ecommerce",
        "data": "Classic cotton t-shirt, perfect for..."
      },
      {
        "locale": "fr_FR",
        "scope": "ecommerce",
        "data": "T-shirt classique en coton..."
      }
    ],
    // Non-localizable, non-scopable:
    "ean": [
      { "locale": null, "scope": null, "data": "1234567890123" }
    ],
    // Price collection:
    "price": [
      {
        "locale": null,
        "scope": null,
        "data": [
          { "amount": "29.99", "currency": "EUR" },
          { "amount": "32.00", "currency": "USD" }
        ]
      }
    ],
    // Select attribute (value is option code, not label):
    "size": [
      { "locale": null, "scope": null, "data": "size_M" }
    ]
  }
}

Key gotcha: Select attribute values are option codes, not labels. size_M is just a code. To get "Medium (M)", fetch the option from /api/rest/v1/attributes/size/options/size_M and read its labels.
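
Since every attribute is a list of locale/scope entries, most scripts end up with a small lookup helper. The sketch below shows one way to do it on a trimmed copy of the product above. `get_value` and `option_label` are hypothetical helpers, and `size_option` is a hand-written payload that assumes the option response carries a `labels` object keyed by locale.

```python
def get_value(product, attr_code, locale=None, scope=None):
    """Return the data for one attribute at a given locale and scope, or None."""
    for entry in product.get("values", {}).get(attr_code, []):
        if entry.get("locale") == locale and entry.get("scope") == scope:
            return entry.get("data")
    return None


def option_label(option, locale):
    """Return the option's label for a locale, falling back to its code."""
    return option.get("labels", {}).get(locale) or option["code"]


# Trimmed version of the example product from this section
product = {
    "identifier": "TSHIRT-BLUE-M",
    "values": {
        "description": [
            {"locale": "en_US", "scope": "ecommerce", "data": "Classic cotton t-shirt"},
            {"locale": "fr_FR", "scope": "ecommerce", "data": "T-shirt classique en coton"},
        ],
        "size": [{"locale": None, "scope": None, "data": "size_M"}],
    },
}

# Assumed shape of a response from /attributes/size/options/size_M
size_option = {"code": "size_M", "attribute": "size", "labels": {"en_US": "Medium (M)"}}

french = get_value(product, "description", locale="fr_FR", scope="ecommerce")
size_code = get_value(product, "size")          # non-localizable, non-scopable
size_text = option_label(size_option, "en_US")  # human-readable label
```

Passing `locale=None` and `scope=None` (the defaults) matches attributes that are neither localizable nor scopable, which mirrors how the API represents them.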

5. Fetching families and attributes

Families define which attributes belong to which products. You need them to know what data to expect:

# Fetch families (first page only: follow _links.next as in section 3
# if you have more than 100)
r = requests.get(f"{BASE_URL}/api/rest/v1/families",
                  headers={"Authorization": f"Bearer {TOKEN}"},
                  params={"limit": 100})
r.raise_for_status()
families = r.json()["_embedded"]["items"]

# Example family structure:
# {
#   "code": "clothing",
#   "labels": { "en_US": "Clothing", "fr_FR": "Vêtements" },
#   "attributes": ["sku", "name", "description", "color", "size", "price"],
#   "attribute_as_label": "name",
#   "attribute_requirements": {
#     "ecommerce": ["sku", "name", "description"],
#     "print": ["sku", "name"]
#   }
# }

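# Fetch attribute details (type, labels, etc.), keyed by attribute code.
# Again this is the first page only; paginate for more than 100 attributes.
r = requests.get(f"{BASE_URL}/api/rest/v1/attributes",
                  headers={"Authorization": f"Bearer {TOKEN}"},
                  params={"limit": 100})
r.raise_for_status()
attributes = {a["code"]: a for a in r.json()["_embedded"]["items"]}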

# Attribute types include:
# pim_catalog_text, pim_catalog_textarea, pim_catalog_number,
# pim_catalog_boolean, pim_catalog_date, pim_catalog_price_collection,
# pim_catalog_metric, pim_catalog_simpleselect, pim_catalog_multiselect,
# pim_catalog_image, pim_catalog_file, pim_reference_data_simpleselect
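
One practical use of the family data is checking which required attributes a product is still missing for a channel. The following is a simplified sketch, assuming the family structure shown above. `missing_required` is a hypothetical helper, it treats `sku` as the identifier attribute (stored in the top-level `identifier` field), and it only checks that some value exists, without looking at locale or scope.

```python
def missing_required(product, family, channel, identifier_attr="sku"):
    """List required attribute codes that have no value on the product."""
    required = family.get("attribute_requirements", {}).get(channel, [])
    values = product.get("values", {})
    missing = []
    for code in required:
        # Assumption: the identifier is stored in the top-level field
        if code == identifier_attr and product.get("identifier"):
            continue
        if not values.get(code):
            missing.append(code)
    return missing


# Family copied from the example structure above
family = {
    "code": "clothing",
    "attribute_requirements": {
        "ecommerce": ["sku", "name", "description"],
        "print": ["sku", "name"],
    },
}

# Illustrative product with a name but no description
product = {
    "identifier": "TSHIRT-BLUE-M",
    "values": {"name": [{"locale": None, "scope": None, "data": "Blue T-shirt"}]},
}

ecommerce_gaps = missing_required(product, family, "ecommerce")
print_gaps = missing_required(product, family, "print")
```

This is only a rough pre-export check and is not a substitute for the completeness figures Akeneo computes itself.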

6. Handling product models (variants)

This is the most complex part of the Akeneo API. Simple products with a parent field are variant products — their parent attributes are NOT included in the /products response. You must fetch them separately and merge:

# Step 1: Fetch all product models, keyed by code
def fetch_product_models(token):
    headers = {"Authorization": f"Bearer {token}"}
    url = f"{BASE_URL}/api/rest/v1/product-models"
    params = {"limit": 100}
    models = {}
    while url:
        r = requests.get(url, headers=headers, params=params)
        r.raise_for_status()
        data = r.json()
        for m in data["_embedded"]["items"]:
            models[m["code"]] = m
        url = data.get("_links", {}).get("next", {}).get("href")
        params = {}  # the next URL already carries the query string
    return models

models = fetch_product_models(TOKEN)

# Step 2: Flatten a product (merge parent attributes)
def flatten_product(product, models):
    if not product.get("parent"):
        return product  # standalone product, no merging needed

    flattened_values = {}

    # Walk up the parent chain
    parent_code = product["parent"]
    while parent_code:
        parent = models.get(parent_code, {})
        parent_values = parent.get("values", {})
        # Merge: only add if not already set (child overrides parent)
        for attr_code, attr_values in parent_values.items():
            if attr_code not in flattened_values:
                flattened_values[attr_code] = attr_values
        parent_code = parent.get("parent")  # go up one more level

    # Product's own values take priority
    flattened_values.update(product.get("values", {}))
    product["values"] = flattened_values
    return product

# Apply to all products
flattened = [flatten_product(p, models) for p in products]

Full explanation: Flattening Akeneo Product Models

7. Categories and reference entities

# Fetch category tree (first page only; paginate if you have more than 100 categories)
r = requests.get(f"{BASE_URL}/api/rest/v1/categories",
                  headers={"Authorization": f"Bearer {TOKEN}"},
                  params={"limit": 100})
r.raise_for_status()
categories = {c["code"]: c for c in r.json()["_embedded"]["items"]}

# Category structure:
# {
#   "code": "shirts",
#   "parent": "mens",
#   "labels": { "en_US": "Shirts", "fr_FR": "Chemises" }
# }

# Build full path for a product's categories:
def get_category_path(cat_code, categories):
    parts = []
    code = cat_code
    while code:
        cat = categories.get(code, {})
        label = cat.get("labels", {}).get("en_US", code)
        parts.insert(0, label)
        code = cat.get("parent")
    return " > ".join(parts)

# get_category_path("shirts", categories) → "Men's > Shirts"

# Reference entities (e.g., brand, color)
r = requests.get(
    f"{BASE_URL}/api/rest/v1/reference-entities/brand/records",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"limit": 100}
)
brand_records = {rec["code"]: rec for rec in r.json()["_embedded"]["items"]}

8. Exporting a full catalog to JSON

Putting it all together — a complete export to a JSON file:

import json

def export_catalog(output_file="catalog.json"):
    token = get_token()

    print("Fetching product models...")
    models = fetch_product_models(token)

    print("Fetching products...")
    products = fetch_all_products(token)

    print("Flattening product models...")
    flattened = [flatten_product(p, models) for p in products]

    print(f"Writing {len(flattened)} products to {output_file}")
    with open(output_file, "w") as f:
        json.dump(flattened, f, indent=2, default=str)

    print("Done!")

export_catalog()
# → catalog.json with all products, attributes merged from parent models

What this script is missing: error handling, token refresh for exports over 1 hour, incremental sync (only export changed products), direct database write, and retries for transient 5xx errors. This is why most teams reach for a connector after building their first export script.
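
To give a sense of what one of those gaps involves, here is a minimal sketch of retrying transient 5xx responses with exponential backoff. `get_with_retries` is a hypothetical helper, not something the API or the requests library provides. It accepts any zero-argument callable that returns an object with a `status_code` attribute, and the attempt count and delays are illustrative defaults.

```python
import time


def get_with_retries(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-5xx response or attempts run out.

    `send` is a zero-argument callable returning an object with a
    `status_code` attribute, for example:
        lambda: requests.get(url, headers=headers)
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(1, max_attempts + 1):
        response = send()
        # Return on success, on client errors, or when out of attempts
        if response.status_code < 500 or attempt == max_attempts:
            return response
        # Exponential backoff: base_delay, then 2x, then 4x, ...
        sleep(base_delay * 2 ** (attempt - 1))
```

The caller still decides what to do with the final response, so `raise_for_status()` can be applied afterwards exactly as in the earlier examples.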

9. When to use the API vs a connector

Use the API directly when:

  • You're doing a one-time data migration or exploration
  • You need a custom destination not supported by connectors
  • Your catalog has very specific transformation logic
  • You're building a custom application that reads Akeneo data

Use a connector when:

  • You need scheduled, recurring exports
  • You want incremental sync (only changed products)
  • You need automatic product model flattening
  • You want exports to PostgreSQL, MongoDB, or MySQL without writing code

Export Akeneo without building the pipeline

SyncPIM handles authentication, pagination, flattening, and incremental sync — so you don't have to.
