
Document Metadata vs Export JSON | data mismatch

[Deleted User] Posts: 0 ✭✭
edited January 2023 in The Astronauts (Q&A)
Hey! Long time no question,

I have noticed a possible inconsistency and I just wanted to make sure that I am understanding everything correctly.

The Problem:

The JSON data file Grooper.DocumentData.json attached to a Batch Folder (Grooper.Core.BatchFolder) differs from the JSON file produced by the Export batch process (Export > Export Behavior > Export Definitions > Export Formats > JSON Metadata).

The description of the JSON Metadata export format says "Exports Document metadata in JSON format." But when I call the API route api/v1/BatchProcessing/Folders/{FolderId}/Metadata with the folder ID, all I get are key-value pairs and none of the other important information, like location and page.

Is there a way to get the full json data without having to export it to a file?

Examples:

Grooper.DocumentData.json
{
  "Ch": [
    {
      // Collapsed this for readability
    }
  ],
  "IxId": "bf240145-ab60-42ee-a9d6-987f50029a1e",
  "Name": "Data Model",
  "Loc": {
    "X1": 0.26,
    "Y1": 0.405,
    "X2": 8.059999,
    "Y2": 13.405
  },
  "FC": 1
}
JSON From Export
{
  "Ch": [
    {
      // Collapsed this for readability
    }
  ],
  "ColumnHeaders": null,
  "RowHeaders": null,
  "InstNo": 0,
  "IxId": "7388f24f-5097-4227-9213-cde94d52edc4",
  "Idx": 0,
  "Len": 0,
  "Page": 0,
  "Ang": 0,
  "Flags": 0,
  "Val": null,
  "Spans": null,
  "Name": "Data Model",
  "Loc": {
    "ptA": null,
    "ptB": null,
    "X1": 1.205,
    "Y1": 0.45,
    "X2": 7.19,
    "Y2": 10.47
  },
  "FC": 0.806176
}
/api/v1/batchprocessing/folders/{FolderID}/metadata
GET /api/v1/batchprocessing/folders/a3f611c3-b682-4642-98ef-48f53891e05e/metadata
{
  "ContentTypeId": "93e86d7e-d065-4f5f-8ab8-dbbc2af67e80",
  "FieldValues": [
    {
      "Key": "property_state",
      "Value": "WA"
    }
  ]
}


Comments

  • dearner Posts: 286 ✭✭✭
    You are correct that the /Metadata endpoint only returns the metadata key-value pairs. 

    You should, however, be able to get any file associated with a folder using the /BatchProcessing/Files endpoint (this includes the Grooper.DocumentData.Json). The endpoint you'll want to hit is /BatchProcessing/Files/{Folder GUID}/{Filename}.

    So if you can get at the GUIDs of the documents you're trying to get the full JSON metadata for, you should be able to pull that Grooper.DocumentData.Json using that endpoint.
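
A minimal sketch of building that request URL, for illustration only. It assumes the Files route sits under the same api/v1 prefix as the Metadata route quoted in the question and that the path segment is BatchProcessing; the host name and the helper name document_data_url are hypothetical, and authentication is left out because the thread does not describe it.

```python
from urllib.parse import quote


def document_data_url(base_url: str, folder_id: str,
                      filename: str = "Grooper.DocumentData.json") -> str:
    """Build the Files endpoint URL for one file attached to a Batch Folder.

    Assumption: the route is /api/v1/BatchProcessing/Files/{Folder GUID}/{Filename}.
    """
    root = base_url.rstrip("/")
    # Percent-encode both path segments so unusual file names stay valid.
    folder_part = quote(folder_id, safe="")
    file_part = quote(filename, safe="")
    return f"{root}/api/v1/BatchProcessing/Files/{folder_part}/{file_part}"


# Hypothetical host; the folder GUID is the one from the question's example.
url = document_data_url("https://grooper.example.com",
                        "a3f611c3-b682-4642-98ef-48f53891e05e")
print(url)
```

The resulting URL can then be requested with whatever HTTP client and credentials the installation requires.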
  • [Deleted User] Posts: 0 ✭✭
    Yeah, I knew I could fetch any file associated with a folder. The problem is that the file on the node does not include page.

    Json file from export: "Page": 3

    That information doesn't appear in the file on the folder node itself
  • dearner Posts: 286 ✭✭✭
    I see what you're getting at. What do the location values look like for a field that's extracted from the second page?

    The export JSON clearly has some additional information added at export time, but I'm wondering if you can extrapolate that from the information available from the attachment.
  • [Deleted User] Posts: 0 ✭✭
    I apologize for such a late reply, @dearner. Holidays + vacations + forgetting to respond.

    So, I did what you asked, and the Grooper.DocumentData.Json file does indeed have Page in it. What I found, though, is that the Page value is one less than the actual page number. For instance, we have a data point called "mixed_use_property" that is on page 3 of this document, but its Page key says 2. Every other data point in the file attached to the folder is the same way. Page doesn't show up when the value is 0, so it appears that Page is an index starting at 0. That means data points on page 1 never get a Page key at all.
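
If that reading is right, the 1-based page number can be recovered from the attached file by treating a missing Page key as index 0. The sketch below assumes child nodes under "Ch" carry "Name" and "Page" keys the same way the root node in the samples does (the children were collapsed in the examples above, so this is a guess); the sample data and the helper name iter_pages are made up for illustration.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_pages(node: Dict[str, Any]) -> Iterator[Tuple[str, int]]:
    """Yield (name, 1-based page number) for a node and all of its descendants."""
    # Page is treated as a zero-based index; a missing key is read as index 0.
    yield node.get("Name", ""), node.get("Page", 0) + 1
    # "Ch" may be absent or null on leaf nodes, so fall back to an empty list.
    for child in node.get("Ch") or []:
        yield from iter_pages(child)


# Hypothetical data shaped like Grooper.DocumentData.json.
sample = {
    "Name": "Data Model",
    "Ch": [
        {"Name": "property_state"},                 # no Page key: read as page 1
        {"Name": "mixed_use_property", "Page": 2},  # index 2: page 3
    ],
}

pages = dict(iter_pages(sample))
print(pages)
```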