Automating Local Business Schema Creation with AI and Python

Learn how to use AI and Python to generate local schema for a large number of locations.

By AJ Dichmann, VP of Digital Strategy at Globe Runner

In this guide I’ll walk you through how I used AI and Python to generate Local Business JSON-LD Schema for one of my enterprise clients with more than 100 franchise locations.

By using tools like Screaming Frog, Python scripting, and AI-assisted code creation, we can simplify what would otherwise be a complex and time-consuming process: creating the schema and then uploading it to each location page.

The Challenge

The client had a WordPress website with a page for each of their franchise locations. They wanted to add local business schema to each location page to improve their search engine visibility and provide more information to users.

Unfortunately, the client’s website was built with a dated page builder and had multiple page templates for the location pages. This made it difficult to extract the location data in a structured format that would allow for easy insertion of schema markup.

This meant that I would have to manually add the schema to each location page, which would take a significant amount of time.

The Solution

I decided to use Screaming Frog, AI, and Python to automate the process of generating and uploading the schema.

Step 1: Extracting Location Data with Screaming Frog

The first step was to gather all current business location information from the client’s website.

To do this I used Screaming Frog’s SEO Spider to extract the location URLs, local business name, address, phone number, and other relevant metadata used for Local Business Schema.

To prepare this data for a later bulk upload I also extracted the WordPress Post ID for each location.

The end result was a CSV file (sample below) with the following columns:

  • Location URL
  • Local Business Name
  • Address
  • Phone Number
  • WordPress Post ID
  • Additional metadata for the schema like hours of operation
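
To make that concrete, here is a hypothetical sample of what locations.csv might look like. The column names and values below are illustrative; your exact headers will depend on how you configure Screaming Frog's custom extraction.

name,website,phone,Street Address,city,state,Zip,Post ID
Example Clinic Dallas,https://clientdomain.com/locations/dallas/,(214) 555-0101,123 Main St,Dallas,TX,75201,1042
Example Clinic Austin,https://clientdomain.com/locations/austin/,(512) 555-0102,456 Oak Ave,Austin,TX,78701,1043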

Step 2: Using AI to Generate a Python Script

Next, I used Cursor AI to help generate a Python script that automates the schema generation process. I started with an example schema for one location, then prompted the AI to:

  1. Take a CSV file (locations.csv) as input.
  2. Append a new schema column to each row.
  3. Format the schema data based on the Local Business Schema example.

By iterating through a few prompts, I was able to generate a fully functional script that dynamically builds schema markup for all locations.

The end result was a Python script that looks like this (with client info removed):

import csv
import json
import os


def create_schema(row):
    """Create a schema.org JSON-LD structure for a clinic based on row data."""

    # Extract values from the row (assuming these columns exist)
    # You'll need to adjust these based on your actual CSV structure
    name = row.get("name", "")
    url = row.get("website", "")
    image = row.get(
        "Image",
        "https://clientdomain.com/image.jpg",
    )
    logo = row.get(
        "Logo",
        "https://clientdomain.com/image.jpg",
    )
    phone = row.get("phone", "")
    description = row.get("description", "")

    # Try different possible column names for address fields
    street = row.get("Street Address", row.get("StreetAddress", row.get("address", "")))
    city = row.get("city", row.get("Locality", ""))
    state = row.get("state", row.get("Region", ""))
    zip_code = row.get("Zip", row.get("postcode", ""))

    # Print the row keys to help debug
    print(f"Available columns in CSV: {list(row.keys())}")
    print(
        f"Address values found - Street: '{street}', City: '{city}', State: '{state}', Zip: '{zip_code}'"
    )

    # Use the phone number as-is, falling back to an empty string
    formatted_phone = phone if phone else ""

    # Create the schema structure
    schema = {
        "@context": "https://schema.org",
        "@type": "MedicalClinic",
        "medicalSpecialty": "Musculoskeletal",
        "name": name,
        "url": url,
        "image": image,
        "logo": logo,
        "telephone": formatted_phone,
        "priceRange": "$$",
        "description": description,
        "knowsAbout": [
            "orthopedics",
            "sports medicine",
            "Pain management",
            "Pain control",
            "Regenerative medicine",
        ],
        "address": {
            "@type": "PostalAddress",
            "streetAddress": street,
            "addressLocality": city,
            "addressRegion": state,
            "postalCode": zip_code,
            "addressCountry": "US",
        },
        "openingHoursSpecification": [
            {
                "@type": "OpeningHoursSpecification",
                "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
                "opens": "08:00",
                "closes": "21:00",
            },
            {
                "@type": "OpeningHoursSpecification",
                "dayOfWeek": "Saturday",
                "opens": "09:00",
                "closes": "20:00",
            },
            {
                "@type": "OpeningHoursSpecification",
                "dayOfWeek": "Sunday",
                "opens": "09:00",
                "closes": "17:00",
            },
        ],
        "contactPoint": {
            "@type": "ContactPoint",
            "contactType": "customer support",
            "telephone": formatted_phone,
            "availableLanguage": ["English", "Spanish", "ASL"],
            "email": "[email protected]",
        },
        "sameAs": [
            f"https://www.facebook.com/",
            f"https://www.instagram.com",
            f"https://www.trustindex.io/",
        ],
    }

    # Format as a script tag with JSON-LD
    script_tag = f"""<script type="application/ld+json">
{json.dumps(schema, indent=4)}
</script>"""

    return script_tag


def process_csv():
    """Process the locations.csv file and add schema to each row."""
    input_file = "locations.csv"
    output_file = "locations_with_schema.csv"

    if not os.path.exists(input_file):
        print(f"Error: {input_file} not found.")
        return

    # Read the CSV file
    with open(input_file, "r", newline="", encoding="utf-8") as csvfile:
        # Print the raw content of the first few lines to see the format
        print("First 5 lines of the CSV file:")
        for i, line in enumerate(csvfile):
            if i < 5:
                print(f"Line {i+1}: {line.strip()}")
            else:
                break
        csvfile.seek(0)  # Reset file pointer to beginning

        # Try to read as CSV
        reader = csv.DictReader(csvfile)
        print(f"CSV headers detected: {reader.fieldnames}")

        fieldnames = reader.fieldnames
        if not fieldnames:
            print("ERROR: No fieldnames detected in CSV. Check file format.")
            return

        # Copy the list so appending "Schema" doesn't mutate the reader's fieldnames
        fieldnames = list(fieldnames)
        if "Schema" not in fieldnames:
            fieldnames.append("Schema")

        # Process each row and write to a new file
        with open(output_file, "w", newline="", encoding="utf-8") as outfile:
            writer = csv.DictWriter(outfile, fieldnames=fieldnames)
            writer.writeheader()

            row_count = 0
            for row in reader:
                row_count += 1
                print(f"\nProcessing row {row_count}:")
                print(f"Row data: {row}")

                # Generate schema for this location
                schema = create_schema(row)

                # Add schema to the row
                row["Schema"] = schema

                # Write the updated row
                writer.writerow(row)

                # Print a sample of the schema (first 100 chars)
                schema_preview = schema[:100] + "..." if len(schema) > 100 else schema
                print(f"Schema generated (preview): {schema_preview}")

    print(
        f"\nProcessing complete. Processed {row_count} rows. Output saved to {output_file}"
    )


if __name__ == "__main__":
    process_csv()

Step 3: Running the Python Script

Once the script was ready, I ran it using:

python generate_schema.py

This processed the CSV file and added the schema markup to each row. The advantage of using Python over a traditional Excel concatenate formula is maintainability: when the schema needs to change, you update the script in one place and rerun it, instead of rebuilding a fragile multi-field formula.
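
Before moving on to the upload, it's worth sanity-checking that every generated schema block is valid JSON. Here's a minimal sketch, assuming the locations_with_schema.csv output and Schema column produced by the script above, that strips the script tags and re-parses each entry:

import csv
import json

# Verify that every Schema cell in the output CSV contains parseable JSON-LD
with open("locations_with_schema.csv", newline="", encoding="utf-8") as csvfile:
    for i, row in enumerate(csv.DictReader(csvfile), start=1):
        # Strip the <script> wrapper to get at the raw JSON
        raw = row["Schema"].replace('<script type="application/ld+json">', "")
        raw = raw.replace("</script>", "")
        try:
            json.loads(raw)
        except json.JSONDecodeError as e:
            print(f"Row {i} ({row.get('name', 'unknown')}): invalid JSON - {e}")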

Step 4: Uploading the Schema to WordPress

With our newly generated CSV, the final step was to upload the schema to each location page.

To do this I used WP All Import, a plugin that can import CSV data into existing WordPress posts, paired with a custom field created in ACF (Advanced Custom Fields) called local_schema.

  1. Create a new custom field called local_schema in ACF.
  2. Create a new import profile in WP All Import and map the schema column to the local_schema custom field, using the WordPress Post ID as the reference.
  3. Import the CSV file.

Now the schema is uploaded to each location page.
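
Because the Post ID is what ties each CSV row to its WordPress page, it's worth checking the file for blank or duplicate IDs before running the import. A quick sketch (the "Post ID" column name here is an assumption; match it to your actual header):

import csv
from collections import Counter

# Count Post IDs in the import file to catch blanks and duplicates
with open("locations_with_schema.csv", newline="", encoding="utf-8") as csvfile:
    ids = [row.get("Post ID", "").strip() for row in csv.DictReader(csvfile)]

missing = ids.count("")
duplicates = [pid for pid, n in Counter(ids).items() if pid and n > 1]
print(f"{len(ids)} rows, {missing} missing Post IDs, duplicates: {duplicates}")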

Step 5: Adding Schema to the Frontend

Now that the schema is stored in each location page's custom field, we need to output it on the frontend of the page.

To do this I used a small PHP snippet hooked into wp_head. The function checks that the current request is for a page and that the local_schema custom field is not empty; if both conditions are met, it echoes the schema into the head. (Pages without the field populated are skipped automatically, so non-location pages are unaffected.)

function add_local_schema() {
    if (is_page() && function_exists('get_field')) {
        $local_schema = get_field('local_schema');
        if (!empty($local_schema)) {
            echo $local_schema;
        }
    }
}
add_action( 'wp_head', 'add_local_schema' );
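
Finally, to confirm the end-to-end result, you can spot-check a few live location pages. The sketch below (the URL is a placeholder, and it assumes the third-party requests library is installed) fetches a page and verifies that a parseable JSON-LD block made it into the HTML:

import json
import re

import requests

# Placeholder URL; swap in one of your actual location pages
url = "https://clientdomain.com/locations/example-location/"
html = requests.get(url, timeout=10).text

# Find each JSON-LD block in the page and try to parse it
for block in re.findall(
    r'<script type="application/ld\+json">(.*?)</script>', html, re.DOTALL
):
    data = json.loads(block)
    print(data.get("@type"), data.get("name"))

You can also paste a location URL into Google's Rich Results Test for an official validation.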

Alternative Approaches

I did consider a few alternative approaches to this problem.

  • Schema plugin like Rank Math - This would have been a much simpler solution, but the client’s dated page builder didn’t allow for easy insertion of schema markup.
  • Use Google Sheets and a concatenate function - This would have been a more manual process, and concatenating 5+ schema fields in a formula would have been a nightmare.
  • Use AI to generate the schema inside the LLM interface - This would have been an effective approach, but prone to hallucinations and difficult to maintain.
  • Use AI to generate the schema inside Google Sheets - Google Sheets AI plugins are great, but they suffer from the same problems as the concatenation and LLM-interface approaches.

Conclusion

This method showcases how AI and Python can work together to automate repetitive tasks, making schema generation faster and more scalable. Instead of manually constructing schema markup for 100+ pages, we can generate all of it programmatically in a single script run.

This guide should give you some ideas on how to leverage AI and Python for similar tasks in your own projects.

Sharing is caring!