If you’re moving vCenter Server from 8.0 U3x to 9.1.0.0 — typically as part of a VCF 9.1 deployment or a VVF 9.1 upgrade — the run can fail partway through with:

Error attempting Upgrade Database Schema
Please check vcintegrity migration logs for details.

This is one of those failures that looks scary in the UI but has a clean, well-understood root cause: a handful of orphaned rows in the vcIntegrity database that trip a new foreign-key constraint introduced in 9.1. Below is the whole story — what the error means, how to read the logs, why it happens, and how to clear it so the upgrade completes. It’s the same shape of problem I wrote about in fixing PostgreSQL on Fleet Management 9.0.1: a schema change colliding with pre-existing data.

This is documented in Broadcom KB 442223. The supported fix uses a script attached to that KB. I’ll walk through the official steps first, then share an unofficial helper of my own for the curious — clearly marked, and not a replacement for the KB script.


The symptom

The upgrade proceeds normally until the vcIntegrity import phase, then halts with the generic schema error shown above. The UI doesn’t tell you what in the schema failed — it just points you at the vcIntegrity migration logs. That’s your cue to SSH in and read the actual exception.


Reading the upgrade log

On the source vCenter, open the vcIntegrity import log:

/var/log/vmware/upgrade/Import_com.vmware.vcIntegrity_####_##_##_##_##.log

The meaningful lines look like this (timestamps and IDs trimmed):

DEBUG vmware-vci-vciInstallUtils [VdbODBCError::VdbODBCError] SQLGetDiagRec
  SQLState='23503' NativeError=1
  MessageText='ERROR: insert or update on table "pm_recommendation_spec"
  violates foreign key constraint "fk_pm_recommendation_spec"
  DETAIL: Key (entity_id, spec_id)=(domain-##, -1000) is not present in
  table "pm_recommendation_info".'

ERROR vmware-vci-vciInstallUtils [installerRunDBCommand 626] [DATABASE]
  Vdb::RunScript, SQL Exec Error: "ODBC error: (23503) - ERROR: insert or
  update on table "pm_recommendation_spec" violates foreign key constraint
  "fk_pm_recommendation_spec" ... " is returned when executing SQL statement
  "ALTER TABLE IF EXISTS PM_RECOMMENDATION_SPEC
       ADD CONSTRAINT FK_PM_RECOMMENDATION_SPEC
           FOREIGN KEY (ENTITY_ID, SPEC_ID)
           REFERENCES PM_RECOMMENDATION_INFO (ENTITY_ID, SPEC_ID)
           ON DELETE CASCADE"

ERROR __main__ Upgrade Phase 'vcIntegrity:Import' failed.
  Exception: Error attempting Upgrade Database Schema
upgrade_errors.PermanentError: Error attempting Upgrade Database Schema

Three things are worth pulling out of that wall of text:

The SQLState is 23503, which in PostgreSQL is foreign_key_violation — not a connectivity, permissions, or disk problem. The database is doing exactly what it was told to do and refusing bad data.

The failing statement is an ALTER TABLE ... ADD CONSTRAINT, not an insert from your workload. The migration is trying to add the new constraint FK_PM_RECOMMENDATION_SPEC to an existing table, and Postgres validates the rows already in that table before it will accept the constraint.

The offending key is (entity_id, spec_id) = (domain-##, -1000). There’s a row in pm_recommendation_spec whose (entity_id, spec_id) pair has no matching parent row in pm_recommendation_info. The -1000 is a sentinel/placeholder value, and domain-## is a managed-object reference — classic leftovers from objects that were removed or never fully written.


Root cause

The pm_recommendation_* tables live in the vcIntegrity database — the component behind vSphere Lifecycle Manager / Update Planner. They store lifecycle “recommendation” data: pm_recommendation_info holds the parent records, and pm_recommendation_spec holds child spec rows that are supposed to reference a parent.

Over the life of an 8.0 environment, it’s possible to end up with child spec rows whose parent info row is gone (or was never created) — for example after domains/clusters are removed, or from older code paths that didn’t clean up specs. In 8.0 there was no enforced relationship between the two tables, so these orphans sat there harmlessly.

vCenter 9.1.0.0 hardens the schema. The migration adds a composite foreign key:

ALTER TABLE PM_RECOMMENDATION_SPEC
    ADD CONSTRAINT FK_PM_RECOMMENDATION_SPEC
        FOREIGN KEY (ENTITY_ID, SPEC_ID)
        REFERENCES PM_RECOMMENDATION_INFO (ENTITY_ID, SPEC_ID)
        ON DELETE CASCADE;

When PostgreSQL adds a foreign key, it validates every existing row in the child table against the parent table. Any orphan — a (entity_id, spec_id) pair with no parent — violates the new constraint, so the engine aborts the ALTER TABLE with error 23503. Because the migration treats this as a PermanentError, the whole vcIntegrity import phase fails and the upgrade stops.

In short: 9.1 is correct to demand referential integrity; your 8.0 data just doesn’t satisfy it yet. The fix is to remove the orphaned spec rows so the constraint can be created cleanly.

Broadcom notes that a fix to prevent this will be included in a future VCF release. Until then, the workaround below is the path forward.


The official fix (KB 442223 workaround)

Do this on the source vCenter appliance, before retrying the upgrade.

1. Snapshot the vCenter appliance

Take a snapshot of the source vCenter VM first. Editing the embedded database is low-risk when done correctly, but a snapshot is your guaranteed rollback. (If your environment enforces a snapshot-age policy, see changing the old-snapshot restriction — just remember to remove this snapshot once the upgrade succeeds.)

2. Download the cleanup script from the KB

Download cleanup_rec_orphans.py from the Attachments section of KB 442223. This is the supported script — use it for production work.

3. Copy the script to the appliance

Place it in /tmp on the vCenter server. You can use an SCP client such as WinSCP. If WinSCP fails with a “Received too large … SFTP packet” error, follow KB 326317 (set the shell to bash for the session, or use the SCP fallback).

4. SSH in as root

Log in to the vCenter appliance over SSH with root. Enable the shell if needed (shell from the appliancesh prompt).

5. Dry-run audit (read-only)

Run the script with no flags first. It reports orphans without changing anything:

su updatemgr -s /bin/bash -c "python3 /tmp/cleanup_rec_orphans.py"

The su updatemgr matters: the updatemgr account owns the vcIntegrity database connection, so the script authenticates correctly without you handling credentials.

6. Execute the cleanup

Once you’ve reviewed the dry-run output, run it again with --execute to delete the orphaned rows:

su updatemgr -s /bin/bash -c "python3 /tmp/cleanup_rec_orphans.py --execute"

7. Retry the upgrade

Re-run the vCenter upgrade. With the orphans gone, the ALTER TABLE ... ADD CONSTRAINT validates cleanly and the vcIntegrity import phase completes.


What the script actually does (the SQL behind it)

You don’t need to know this to run the KB script, but it helps to understand why it’s safe. The job is purely to find child rows with no parent and remove them. The detection is a straightforward anti-join:

-- Orphans: spec rows whose (entity_id, spec_id) has no parent info row
SELECT s.entity_id, s.spec_id
FROM pm_recommendation_spec s
LEFT JOIN pm_recommendation_info i
       ON i.entity_id = s.entity_id
      AND i.spec_id   = s.spec_id
WHERE i.entity_id IS NULL
ORDER BY s.entity_id, s.spec_id;

And the cleanup is the matching delete:

DELETE FROM pm_recommendation_spec s
WHERE NOT EXISTS (
    SELECT 1
    FROM pm_recommendation_info i
    WHERE i.entity_id = s.entity_id
      AND i.spec_id   = s.spec_id
);

That’s it. You’re deleting only spec rows that already point at nothing — exactly the rows the new constraint would reject. No valid recommendation data is touched.


Optional: an unofficial audit/cleanup helper

Use the KB script for anything you care about. The Bash helper below is a community convenience I wrote to mirror the same logic for quick auditing in labs. It is unofficial and unsupported, runs read-only by default, and should only be pointed at a vCenter you’ve snapshotted. If in doubt, stop and use cleanup_rec_orphans.py from the KB.

It does three things: locate the database that actually owns the table (so it doesn’t matter what the DB is named in your build), report the orphan count and the offending pairs, and — only with --execute — delete them inside a transaction.

#!/usr/bin/env bash
# audit_rec_orphans.sh  —  UNOFFICIAL community helper (KB 442223)
#
# Mirrors the intent of Broadcom's cleanup_rec_orphans.py: find/remove
# orphaned rows in pm_recommendation_spec that have no parent in
# pm_recommendation_info and therefore block the FK_PM_RECOMMENDATION_SPEC
# constraint added by the vCenter 9.1.0.0 vcIntegrity schema upgrade.
#
#   DRY-RUN by default.  Pass --execute to delete.
#   ALWAYS snapshot the vCenter appliance first.
#   Prefer the official KB script for supported/production environments.
#
# Run it the same way the KB runs its script (as the vcIntegrity DB owner):
#   su updatemgr -s /bin/bash -c "/tmp/audit_rec_orphans.sh"            # audit
#   su updatemgr -s /bin/bash -c "/tmp/audit_rec_orphans.sh --execute"  # delete
set -euo pipefail

EXECUTE=0
[ "${1:-}" = "--execute" ] && EXECUTE=1

if [ "$(id -un)" != "updatemgr" ]; then
  echo "[!] Not running as 'updatemgr' — DB auth will likely fail."
  echo "    Re-run:  su updatemgr -s /bin/bash -c \"$0 ${1:-}\""
  exit 1
fi

# Prefer the embedded vPostgres client if present
PSQL="/opt/vmware/vpostgres/current/bin/psql"
[ -x "$PSQL" ] || PSQL="$(command -v psql)"

# Discover which database owns pm_recommendation_spec (override with VCI_DB=...)
VCI_DB="${VCI_DB:-}"
if [ -z "$VCI_DB" ]; then
  for db in $("$PSQL" -Atqc \
        "SELECT datname FROM pg_database WHERE datistemplate=false" 2>/dev/null); do
    if [ -n "$("$PSQL" -Atqd "$db" \
          -c "SELECT to_regclass('pm_recommendation_spec')" 2>/dev/null)" ]; then
      VCI_DB="$db"; break
    fi
  done
fi
[ -n "$VCI_DB" ] || { echo "[x] Could not locate the vcIntegrity DB (set VCI_DB=...)"; exit 1; }
echo "[*] Using database: $VCI_DB"

ORPHAN_FILTER="FROM pm_recommendation_spec s
  LEFT JOIN pm_recommendation_info i
         ON i.entity_id = s.entity_id AND i.spec_id = s.spec_id
  WHERE i.entity_id IS NULL"

COUNT="$("$PSQL" -Atqd "$VCI_DB" -c "SELECT count(*) $ORPHAN_FILTER")"
echo "[*] Orphaned pm_recommendation_spec rows: $COUNT"

if [ "$COUNT" = "0" ]; then
  echo "[+] No orphans — FK_PM_RECOMMENDATION_SPEC will validate cleanly."
  exit 0
fi

echo "[*] Offending (entity_id, spec_id) pairs:"
"$PSQL" -d "$VCI_DB" -c "SELECT s.entity_id, s.spec_id $ORPHAN_FILTER
                         ORDER BY s.entity_id, s.spec_id"

if [ "$EXECUTE" -ne 1 ]; then
  echo
  echo "[i] DRY-RUN only. Re-run with --execute to delete the rows above."
  exit 0
fi

echo "[*] Deleting orphaned rows in a transaction…"
"$PSQL" -v ON_ERROR_STOP=1 -d "$VCI_DB" <<'SQL'
BEGIN;
DELETE FROM pm_recommendation_spec s
WHERE NOT EXISTS (
  SELECT 1 FROM pm_recommendation_info i
  WHERE i.entity_id = s.entity_id AND i.spec_id = s.spec_id
);
COMMIT;
SQL
echo "[+] Done. Re-run without --execute to confirm 0 orphans, then retry the upgrade."

Save it to /tmp/audit_rec_orphans.sh, then run the dry-run first and review the output before ever adding --execute. Like the KB script, it’s idempotent — running it again after a successful cleanup simply reports zero orphans.


Verifying the fix

Re-run the audit (dry-run) — it should now report 0 orphaned rows. That single number is the whole signal: zero orphans means the new foreign key has nothing left to reject.

Then retry the vCenter upgrade and confirm the vcIntegrity:Import phase passes. If you want belt-and-suspenders confirmation, tail the new import log and check that the ALTER TABLE ... ADD CONSTRAINT FK_PM_RECOMMENDATION_SPEC statement executes without a 23503 error:

grep -i "FK_PM_RECOMMENDATION_SPEC\|23503\|Upgrade Database Schema" \
  /var/log/vmware/upgrade/Import_com.vmware.vcIntegrity_*.log

No matches on 23503 (and a clean run past the schema step) means you’re through it.


Rollback

If anything looks wrong after the cleanup or the retried upgrade, revert to the snapshot you took in step 1 and reassess before trying again. The cleanup itself only removes already-orphaned rows, so a revert puts you exactly back where you started.

Once the upgrade has completed successfully and vCenter is healthy, delete the snapshot — leaving snapshots on a vCenter appliance long-term causes its own problems (disk growth, performance, and consolidation headaches).


Notes & gotchas

Why su updatemgr? That OS account maps to the database role that owns the vcIntegrity connection. Running the script under it means you never handle DB credentials by hand, and the connection has exactly the rights it needs.

Is deleting these rows safe? Yes — you’re only removing pm_recommendation_spec rows that already reference a non-existent parent. They’re dead lifecycle-recommendation data; no live configuration depends on them. Recommendations regenerate as needed after the upgrade.

Does it apply to VVF as well as VCF? The KB lists both VMware Cloud Foundation 9.1 and VMware vSphere Foundation 9.1 — the trigger is the vCenter 9.1.0.0 schema change, so it’s the same regardless of which edition you’re upgrading under.

Will it come back? Broadcom has stated a fix is planned for a future VCF release so the orphans don’t block the migration. Until then, audit-then-clean is the routine.


References