Document Metadata vs Export JSON | data mismatch

[Deleted User] · January 2023

Hey! Long time no question,

I have noticed a possible inconsistency and I just wanted to make sure that I am understanding everything correctly.

The Problem:

The JSON data file Grooper.DocumentData.json attached to a Batch Folder (Grooper.Core.BatchFolder) differs from the data in the JSON file you get from the Export batch process (Export > Export Behavior > Export Definitions > Export Formats > JSON Metadata).

I noticed the description of the JSON Metadata export format says "Exports Document metadata in JSON format." But when I use the API route api/v1/BatchProcessing/Folders/{FolderId}/Metadata and pass the folder ID, all I get are key-value pairs and none of the other important information like location and page.

Is there a way to get the full json data without having to export it to a file?

Examples:

Grooper.DocumentData.json

// Grooper.DocumentData.json
{
  "Ch": [
    {
      // Collapsed this for readability
    }
  ],
  "IxId": "bf240145-ab60-42ee-a9d6-987f50029a1e",
  "Name": "Data Model",
  "Loc": {
    "X1": 0.26,
    "Y1": 0.405,
    "X2": 8.059999,
    "Y2": 13.405
  },
  "FC": 1
}

JSON From Export

// JSON From Export
{
  "Ch": [
    {
      // Collapsed this for readability
    }
  ],
  "ColumnHeaders": null,
  "RowHeaders": null,
  "InstNo": 0,
  "IxId": "7388f24f-5097-4227-9213-cde94d52edc4",
  "Idx": 0,
  "Len": 0,
  "Page": 0,
  "Ang": 0,
  "Flags": 0,
  "Val": null,
  "Spans": null,
  "Name": "Data Model",
  "Loc": {
    "ptA": null,
    "ptB": null,
    "X1": 1.205,
    "Y1": 0.45,
    "X2": 7.19,
    "Y2": 10.47
  },
  "FC": 0.806176
}

/api/v1/batchprocessing/folders/{FolderID}/metadata

GET /api/v1/batchprocessing/folders/a3f611c3-b682-4642-98ef-48f53891e05e/metadata
{
  "ContentTypeId": "93e86d7e-d065-4f5f-8ab8-dbbc2af67e80",
  "FieldValues": [
    {
      "Key": "property_state",
      "Value": "WA"
    }
  ]
}

dearner · January 2023

You are correct that the /Metadata endpoint only returns the metadata key-value pairs.

You should, however, be able to get any file associated with a folder using the /BatchProcessing/Files endpoint (this includes the Grooper.DocumentData.Json). The endpoint you'll want to hit is /BatchProcessing/Files/{Folder GUID}/{Filename}.

So if you can get at the GUIDs of the documents you're trying to get the full JSON metadata for, you should be able to pull that Grooper.DocumentData.Json using that endpoint.
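For illustration, here is a minimal Python sketch of building that request. It assumes the Files route sits under the same api/v1 prefix as the Metadata route quoted earlier in the thread and uses the /BatchProcessing/Files naming; the host name, the helper names, and the idea of passing authentication through a headers dictionary are hypothetical and not taken from Grooper documentation.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen


def document_data_url(base_url, folder_id, filename="Grooper.DocumentData.json"):
    """Build the URL for a file attached to a folder.

    Assumption: the route is <base>/api/v1/BatchProcessing/Files/{folder}/{file},
    mirroring the Metadata route shown in this thread.
    """
    return "{}/api/v1/BatchProcessing/Files/{}/{}".format(
        base_url.rstrip("/"),
        quote(folder_id, safe=""),
        quote(filename, safe=""),
    )


def fetch_document_data(base_url, folder_id, headers=None):
    """Download and parse the attached JSON file.

    `headers` is a placeholder for whatever authentication your
    installation requires; nothing here is specific to Grooper.
    """
    request = Request(document_data_url(base_url, folder_id), headers=headers or {})
    with urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `document_data_url("https://grooper.example.com", "<folder GUID>")` only builds the string, so it is safe to try without a server; `fetch_document_data` performs the actual GET.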

[Deleted User] · January 2023

Yeah, I knew I could fetch any file associated with a folder. The problem is that the file on the node does not include page.

Json file from export: "Page": 3

That information doesn't appear in the file on the folder node itself

dearner · January 2023

I see what you're getting at. What do the location values look like for a field that's extracted from the second page?

The export JSON clearly has some additional information added at export time, but I'm wondering if you can extrapolate that from the information available from the attachment.

[Deleted User] · February 2023

I apologize for such a late reply, @dearner. Holidays + vacations + forgetting to respond.

So, I did what you asked, and the Grooper.DocumentData.Json file does indeed have Page in it. But what I found is that the Page value is one less than the actual page number. For instance, we have a data point called "mixed_use_property" that is on page 3 of this document, but its Page key says 2. Every other data point in the file attached to the folder is the same way. Page doesn't show up at all when its value is 0, so it appears the page numbers are an index starting at 0. That means the file never carries an explicit page number for page 1 data points.
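Based on that observation, a small Python sketch can recover human-readable page numbers from the attached file: treat a missing Page key as 0 and add 1. It assumes child nodes under "Ch" have the same shape as the root ("Name", optional "Page", optional "Ch"), which the collapsed examples above do not confirm; the function name and the sample data are hypothetical.

```python
def one_based_pages(node, path=()):
    """Yield (dotted field name, 1-based page number) for a node and its children.

    Assumptions: "Page" is a zero-based index and is omitted (or null)
    when the value would be 0, as observed in this thread.
    """
    here = path + (node.get("Name", ""),)
    # A missing or null Page is treated as index 0, i.e. page 1.
    yield ".".join(here), (node.get("Page") or 0) + 1
    for child in node.get("Ch") or []:
        yield from one_based_pages(child, here)


# Hypothetical sample shaped like the attached file.
sample = {
    "Name": "Data Model",
    "Ch": [
        {"Name": "property_state"},                # no Page key: page 1
        {"Name": "mixed_use_property", "Page": 2}, # index 2: page 3
    ],
}

for name, page in one_based_pages(sample):
    print(name, page)
```

With this reading, page 1 data points are recoverable after all: the absence of the key is itself the signal.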
